USE OF COMPUTATIONALLY DERIVED PROTEIN STRUCTURES OF 
GENETIC POLYMORPHISMS IN PHARMACOGENOMICS AND CLINICAL 
APPLICATIONS 

RELATED APPLICATIONS 

This application is a continuation-in-part of 
U.S. application Serial No. 09/438,566 to Kalyanaraman Ramnarayan, 
Edward T. Maggio and P. Patrick Hess, filed November 10, 1999 entitled 
"USE OF COMPUTATIONALLY DERIVED PROTEIN STRUCTURES OF 
GENETIC POLYMORPHISMS IN PHARMACOGENOMICS FOR DRUG 
DESIGN AND CLINICAL APPLICATIONS"; and U.S. application Serial No. 
(Attorney Dkt. No. 24737-1906B) to Kalyanaraman Ramnarayan, Edward 
T. Maggio and P. Patrick Hess, filed November 1, 2000, entitled "USE OF 
COMPUTATIONALLY DERIVED PROTEIN STRUCTURES OF GENETIC 
POLYMORPHISMS IN PHARMACOGENOMICS FOR DRUG DESIGN AND 
CLINICAL APPLICATIONS." U.S. application Serial No. (Attorney Dkt. 
No. 24737-1906B) is a continuation of U.S. application Serial No. 
09/438,566. The above-noted applications are incorporated by reference 
in their entirety. 

Incorporation by reference of Tables provided on Compact Disks 

An electronic version on compact disk (CD) ROM of Tables 4 and 
5, which set forth coordinates for three-dimensional structures of proteins 
in the database described herein, is filed herewith. The contents thereof 
are incorporated by reference in their entirety. Table 4 sets forth the HIV 
reverse transcriptase coordinates, and Table 5 sets forth the HIV protease 
coordinates. The files that contain Table 4 are entitled 1906CTAB.001 and 
1906CTAB.002, were created on November 10, 2000, and are 59,538 
kilobytes and 304 kilobytes, respectively. The file that contains Table 5 is 
entitled 1906CTAB.003, was created on November 10, 2000, and is 
11,413 kilobytes. 

FIELD OF THE INVENTION 

The present invention is related to computer-based methods and 
relational databases that use three-dimensional (3-D) protein structural 
models derived from genetic polymorphisms in the areas of computer-assisted 
drug design and the prediction of clinical responses in patients. 

BACKGROUND OF THE INVENTION 

Recent advances in molecular biology, such as the discovery and 
identification of large numbers of genes and the sequences thereof 
encoded in the genomes of humans, other mammals and infectious 
disease agents, have contributed to the identification of a large number of 
proteins, biological receptors and other macromolecules and complexes 
that are promising therapeutic targets. Based on the information derived 
from the gene sequences, the three-dimensional (3-D) molecular 
structures of the corresponding target proteins or receptors can be 
determined. 

Since 3-D protein structure is related to biological function, 
structure-based drug design is an increasingly useful methodology that 
has made a great impact in the design of biologically active lead 
compounds. Drug designers can design and screen potential new drugs 
via computational methods, such as docking or binding studies, before 
patient testing begins. These experiments can be performed in silico at a 
small fraction of the cost of clinical testing. 

The resulting molecules, while serving as lead compounds, often 
have unpredictable effects when employed in clinical trials. In addition, it 
has been observed that existing drugs with known clinical efficacy 
often fail to achieve beneficial results when given to particular patients or 
to particular subpopulations of patients, such as ethnic groups. Genetic 
stratification of a population can be the difference between drug failure 
and drug approval. Hence there is a need to develop methods to improve 
the drug discovery process. Therefore, it is an object herein to 
provide, among a variety of benefits, methods and products that address 



24737-1 906C 



and solve these problems. In particular, it is an object herein to provide 
computationally-based methods for drug design, for designing clinical 
testing protocols, for identifying new drug candidates and drug therapies, 
and for predicting drug sensitivity and resistance, as well as other methods. 

SUMMARY OF THE INVENTION 

Provided herein are computer-based methods for generating and 
using three-dimensional (3-D) structural models of target biomolecules, 
particularly polymorphic and allelic variants. Also provided herein are 
databases that contain the sequences of such variants and also the 3-D 
structure of the variants for use with the methods. 

Genetic polymorphisms arise, for example, as a result of gene 
sequence differences or as a result of post-translational modifications, 
including glycosylation. Hence genetic polymorphisms are manifested as 
gene products and proteins having variant structures. The variant 
structures result in differences in biological responses among the 
originating organisms. These differences in response include, but are not 
limited to, differences among patient responses to a particular drug, 
effective dosage differences, and side effects. With respect to infectious 
organisms, some polymorphisms may arise that confer resistance or 
susceptibility to particular drug therapies by altering the drug target 
structure. 

Structural changes that arise as a result of genetic polymorphisms 
are not of unlimited variety, since 3-D structure impacts upon function. A 
knowledge of the repertoire of the fine differences among generally similar 
3-D structures of particular proteins will permit design of drugs that bind 
to the most polymorphisms, drugs that induce the fewest side-effects, 
and drugs that are more effective against infectious agents. Knowledge 
of these structures ultimately will permit design or selection of drugs that 
is specific to a patient or to a subpopulation, such as an ethnic, age, or 
gender group. 

The methods that are provided are for determining and using 
three-dimensional (3-D) protein structures that are derived from genetic 
polymorphisms to understand differences in biological activity that result 
from the polymorphisms, and to use this understanding to aid in the 
identification of potential new drug candidates and drug therapies. Also 
provided are methods for analyzing 3-D structures of protein structural 
variant targets derived from genetic polymorphisms to identify common 
structural features among the variants; methods for identifying structural 
changes in target proteins that are associated with multiple mutations 
arising from genetic polymorphisms and correlating this information with 
biological activity; and methods for using clinical data in conjunction with 
structural variants derived from genetic polymorphisms to understand and 
predict the pharmacological effects and clinical outcomes for drugs or 
potential drugs. Also provided are methods for generating 3-D protein 
structures derived from a given genotype to analyze protein-drug binding 
in silico to predict drug sensitivity or resistance. Also provided are 
databases that are used in methods provided herein and methods for 
generating the databases. 

In particular, target biomolecules are protein structural variants 
encoded by genes containing genetic variations, or polymorphisms. 3-D 
models of the structures of proteins are determined. The models are 
generated using molecular modeling techniques, such as homology 
modeling. The resulting models are then used in the methods provided 
herein, which include structure-based drug design studies to design and 
identify drugs that bind to particular structural variants; studies to predict 
clinical responses in patients; and studies to design drugs that bind to all 
or a substantial portion of the allelic variants of a target, thereby 
increasing the population of patients for whom a particular drug will be 
effective and/or decreasing undesirable side effects in a larger population. 

Hence, computer-based methods of drug design based on target 
protein structural models derived from genetic polymorphisms are 
provided. The methods involve obtaining one, preferably two or more, 
amino acid sequences of a target protein that is the product of a gene 
exhibiting genetic polymorphisms, where the sequences represent different 
genetic polymorphisms, and generating 3-D protein structural variant 
models from the sequences. Structure-based drug design techniques are 
used to design potential new drug candidates or to suggest modifications 
to existing drugs based on predicted intermolecular interactions of the 
drugs or drug candidates with the models. Alternatively, drug molecules 
can be computationally docked with 3-D protein structural variant models 
based upon the sequences and energetically refined before performing 
structure-based drug design studies. 

In preferred embodiments, binding interactions between a drug or 
potential new drug candidate molecules and the structural variants are 
calculated in order to optimize intermolecular interactions between drug or 
potential drug molecules and the structural variant models or to select 
drug therapies for patients by determining a drug or drugs that have 
favorable binding interactions with the structural variant models. 

In other embodiments, the binding interactions are determined by 
calculating the free energy of binding between the protein structural 
variant model and a docked molecule; and decomposing the total free 
energy of binding based on the interacting residues in the protein active 
site. 
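
By way of illustration only, the per-residue decomposition described above can be sketched as follows. The sketch assumes a simple pairwise-additive interaction energy (a Lennard-Jones term plus a Coulomb term) as a stand-in for the free energy of binding; the function names, parameter values, residue labels and atom records are hypothetical and are not taken from the methods provided herein.

```python
import math
from collections import defaultdict

# Hypothetical parameters for a toy nonbonded energy function (assumptions).
EPSILON = 0.1       # Lennard-Jones well depth, kcal/mol
SIGMA = 3.5         # Lennard-Jones distance parameter, angstroms
COULOMB_K = 332.06  # converts e^2/angstrom to kcal/mol
DIELECTRIC = 4.0    # assumed uniform dielectric constant


def pair_energy(atom_a, atom_b):
    """Toy interaction energy (kcal/mol) between two atoms."""
    r = math.dist(atom_a["xyz"], atom_b["xyz"])
    sr6 = (SIGMA / r) ** 6
    lennard_jones = 4.0 * EPSILON * (sr6 * sr6 - sr6)
    coulomb = COULOMB_K * atom_a["charge"] * atom_b["charge"] / (DIELECTRIC * r)
    return lennard_jones + coulomb


def decompose_by_residue(protein_atoms, ligand_atoms):
    """Sum protein-ligand pair energies separately for each protein residue."""
    per_residue = defaultdict(float)
    for p_atom in protein_atoms:
        for l_atom in ligand_atoms:
            per_residue[p_atom["residue"]] += pair_energy(p_atom, l_atom)
    return dict(per_residue)


# Made-up coordinates and charges, for illustration only.
protein_atoms = [
    {"residue": "ASP_1", "xyz": (0.0, 0.0, 0.0), "charge": -0.8},
    {"residue": "ILE_2", "xyz": (8.0, 0.0, 0.0), "charge": 0.0},
]
ligand_atoms = [{"xyz": (4.0, 0.0, 0.0), "charge": 0.5}]

contributions = decompose_by_residue(protein_atoms, ligand_atoms)
total_energy = sum(contributions.values())

# List residues from most to least favorable contribution.
for residue, energy in sorted(contributions.items(), key=lambda item: item[1]):
    print(f"{residue}: {energy:.2f} kcal/mol")
```

Because the toy energy is pairwise additive, the per-residue terms sum exactly to the total; a more rigorous free energy treatment would generally include solvation and entropic terms that do not partition as cleanly.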

After the protein structural variant models are generated, selected 
model structures are analyzed to determine common structural features 
that are conserved throughout the selected models. The conserved 
structural features can serve as scaffolds or pharmacophore models into 
which potential drugs or modified drugs are docked. For example, the 
selected model structures may represent the structural variants resulting 
from the most commonly occurring genetic polymorphisms or from 
genetic polymorphisms found in a specific patient subpopulation, such as 
a particular age group, ethnic or racial group, sex, or other subpopulation. 
Alternatively, the models may be selected based on clinical information; 
for example, the structural variants may be derived from patients 
receiving a specific treatment regimen or exhibiting a particular clinical 
response to a given drug, or selected based on the duration of a particular 
drug treatment. 

The methods provided herein can be used for predicting clinical 
responses in patients based on genetic polymorphisms. For example, a 
structural variant model derived from a subject, such as a human patient, 
exhibiting a particular genetic polymorphism is generated and screened 
against a number of reference protein structural variant models derived 
from genetic polymorphisms of the same gene in other such subjects. In 
certain embodiments, the reference structures are stored in a database, 
preferably with observed clinical data associated with the structures, or 
polymorphisms. The structural variant model from the subject is 
compared to the reference structures, for example, by database searching, 
in order to identify reference structural variants that are similar to the 
model structure derived from the subject. Based on the premise that 
structurally similar targets will have similar clinical responses, a clinical 
outcome can be predicted for the patient based on the structures 
identified through structural comparison or database searching. This 
information can also be used in the design and analysis of clinical trials; it 
can also be used for selecting appropriate therapies for a subject in 
instances in which the subject is a patient and the protein is a drug 
target. 
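
As a minimal sketch of the structural comparison described above, the following example ranks reference models by root-mean-square deviation (RMSD) from a subject's model and reports the clinical annotation of the closest reference. It assumes that all models have already been superposed and list the same atoms in the same order; the record layout, identifiers, coordinates and outcome labels are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-superposed coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def most_similar_references(query_coords, references, top_n=1):
    """Return the top_n (rmsd, reference) pairs, most similar first."""
    scored = [(rmsd(query_coords, ref["coords"]), ref) for ref in references]
    scored.sort(key=lambda pair: pair[0])
    return scored[:top_n]


# Made-up reference entries with associated clinical annotations.
references = [
    {"id": "ref_A", "outcome": "responder",
     "coords": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]},
    {"id": "ref_B", "outcome": "non-responder",
     "coords": [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 2.0, 0.0)]},
]

# Made-up model derived from the subject.
subject_coords = [(0.0, 0.0, 0.0), (1.0, 0.1, 0.0), (2.0, 0.2, 0.0)]

best_rmsd, best_ref = most_similar_references(subject_coords, references)[0]
print(best_ref["id"], best_ref["outcome"], round(best_rmsd, 3))
```

RMSD is used here only as one possible similarity measure; comparisons restricted to the active site, or based on other structural descriptors, could be substituted without changing the overall flow.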

The methods are also used to design therapeutic agents that are 
active against biological targets that have become drug resistant, 
particularly due to genetic mutations. In certain embodiments, 3-D 
protein structural variant models are generated for a target protein in 
which genetic mutations have occurred and against which a given drug is 
no longer biologically active. The models are compared to 3-D protein 
structural variant models of the target protein against which the drug has 
biological activity in order to identify structural differences between the 
susceptible and resistant targets. The differences can be used to 
understand the structural contributions to drug resistance, and this 
information can be utilized in structure-based drug design calculations to 
identify new drugs or modifications to the existing drug that circumvent 
the resistance problem. 

A computer-based method for identifying compensatory mutations 
in a target protein is also provided. The method involves obtaining the 
amino acid sequence of a target protein containing multiple amino acid 
mutations that is expressed in a patient, where the structure of a form of 
the target protein that responds to a particular drug, including the active 
site, has been structurally characterized; generating a 3-D structural 
model of the mutated protein; comparing the structure of the mutated 
protein with the form of the protein that responds to the drug to identify 
structural differences and/or similarities arising from the mutations; 
comparing the biological activities of the drug against the mutated protein 
and the form of the protein that responds to the drug to determine the 
effects of the mutations on drug response; and identifying the mutations 
in the protein that affect biological activity based on the comparisons. 

The target biomolecules can also be used in a method referred to herein 
as computational phenotyping to predict drug sensitivity or resistance for 
a given genotype. Computer-based methods for identifying phenotypes 
in silico are thus provided. The methods involve obtaining, from a 
patient specimen, such as a body fluid or tissue sample, including blood, 
cerebrospinal fluid, urine, saliva and sweat, the amino acid sequence of a 
target protein; generating a 3-D structural model of the target protein; 
performing protein-drug binding analyses; and predicting drug sensitivity 
or resistance based on the protein-drug binding analyses. 



Molecular structure databases containing protein structural variant 
models produced by the methods are also provided. The databases may 
also contain biological or clinical data associated with the structural 
variants. The databases can be interfaced to a molecular graphics 
package for visualization and analysis of the 3-D molecular structural 
models. In particular, databases containing the 3-D structures of 
polymorphic variants of selected target genes, particularly 
pharmaceutically significant genes with pharmaceutically significant gene 
products, such as proteases and polymerases, including reverse 
transcriptases, and receptors, such as cell surface receptors, are 
provided. The databases may be stored and provided on any suitable 
medium, including, but not limited to, floppy disks, hard drives, CD-ROMs 
and DVDs. 

Also provided are relational databases for managing and using 
information relating to genetic polymorphisms. The databases contain 3- 
D molecular coordinates for structural variants derived from genetic 
polymorphism, a molecular graphics interface for 3-D molecular structure 
visualization, computer functionality for protein sequence and structural 
analyses and database searching tools. The databases may further 
include observed clinical data associated with the genetic polymorphism. 
The databases provide a means to design allele-specific drugs and 
also to identify, among alleles, common or conserved structural features 
that can serve as the target for drug design. 

The databases can also be used for identification of invariant 
residues and regions of a target biomolecule, such as an HIV protease or 
reverse transcriptase. The identified invariant regions are then used to 
computationally screen compounds, preferably small molecules, by 
assessing binding interactions. The compounds so identified serve as 
candidates for drugs that will be effective for a larger proportion of a 
population or, where the target protein is from a pathogen, against a 
broader range of variants of the pathogen. 
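
One simple way to identify invariant residues, sketched below for illustration, is to tabulate residue frequencies at each position of a set of pre-aligned variant sequences and to report the positions at which a single residue meets a chosen frequency threshold. The function names, the threshold parameter and the short example sequences are assumptions made for this sketch; the sequences are not entries of the databases described herein.

```python
from collections import Counter


def residue_frequencies(aligned_sequences):
    """Count the residues observed at each position of equal-length sequences."""
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")
    return [Counter(seq[i] for seq in aligned_sequences) for i in range(length)]


def invariant_positions(aligned_sequences, threshold=1.0):
    """Return (position, residue) pairs whose top residue frequency >= threshold.

    Positions are numbered from 1. A threshold of 1.0 keeps only strictly
    invariant positions; lower values admit highly conserved positions.
    """
    total = len(aligned_sequences)
    conserved = []
    for position, counts in enumerate(residue_frequencies(aligned_sequences), start=1):
        residue, count = counts.most_common(1)[0]
        if count / total >= threshold:
            conserved.append((position, residue))
    return conserved


# Made-up aligned fragments, for illustration only.
variants = ["PQITLW", "PQVTLW", "PQITLW", "PQLTLW"]

strict = invariant_positions(variants)
relaxed = invariant_positions(variants, threshold=0.5)
print(strict)
print(relaxed)
```

The positions returned could then be mapped onto the 3-D models to define the invariant region against which compounds are screened.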

Systems, including computers, containing the databases also are 
provided herein. Any computer known to those of skill in the art for 
maintaining such databases is contemplated. User interfaces for 
accessing and manipulating the databases and content thereof are also 
provided. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 illustrates a method for creating a protein structural variant 
relational database. 

FIG. 2 is a flow chart that describes one method used to generate 
structural variant models derived from genetic polymorphisms and to use 
the models in structure-based drug design studies. 

FIG. 3 is a flow chart that describes an alternative method used to 
generate structural variant models derived from genetic polymorphisms 
and to use the models in structure-based drug design studies. 

FIG. 4 shows the correlation between experimental and calculated 
changes of binding energy upon ligand modifications in the binding site of 
NS3. 

FIG. 5 shows a comparison of calculated versus experimental 
binding free energy changes for complexes of the tumor necrosis factor 
(TNF) receptor with different inhibitors. 

FIG. 6 shows the HIV PR inhibitors approved by the FDA. 

FIG. 7 shows the frequency versus amino acid residue plot of HIV PR. 

FIG. 8 shows a frequency analysis of 10,591 HIV PR sequences, 
where ResNum is the residue number; TotOcc is the total occurrence of 
the mutation; Dist is the distance of the mutating residue from the 
approximate center of the active site (Asp28); WtAA is the amino acid in 
the wild type protein; NumMut is the number of mutations; and MutList is 
a list of amino acid mutations. 

FIG. 9 is a block diagram of an exemplary computer. 

FIG. 10 is a graphical representation of a relational database. 



FIG. 11 is a tabulation of the 3-D coordinates of a representative 
entry in a database that includes 3-D structures. 

DETAILED DESCRIPTION OF THE INVENTION 

A. Definitions 
B. Computer-based methods of drug design based on genetic polymorphisms 
    1. Methods for obtaining amino acid sequences of a target protein 
    2. Generation of 3-D protein structural variant models 
        a. Homology Modeling 
        b. Ab initio generation of 3-D structures 
        c. Crystal structures 
    3. Use of 3-D structural variant models in drug design 
        a. Selection of relevant structural variants 
        b. Drug design 
        c. Computational docking 
        d. Free energy of binding studies 
C. Applications of computer-based methods 
    1. Genetic polymorphisms and structure-based drug design 
    2. Drug resistance 
    3. Identification of conserved structural features or pharmacophores 
    4. Identification of compensatory structural changes 
    5. Clinical Applications 
D. Creation of 3-D Structural Polymorphism Databases 
    1. Exemplary Databases and generation thereof 
    2. Computer systems and Database 
E. Computational phenotyping 

A. Definitions 

Unless defined otherwise, all technical and scientific terms used 
herein have the same meaning as is commonly understood by one of skill 
in the art to which this invention belongs. All patents, patent 
applications, published patent applications and publications referred to 
herein are, unless noted otherwise, incorporated by reference in their 
entirety. In the event a definition in this section is not consistent with 
definitions elsewhere, the definition set forth in this section will control. 

As used herein, polymorphism refers to a variation in the sequence 
of a gene in the genome amongst a population, such as allelic variations 
and other variations that arise or are observed. Genetic polymorphisms 
refer to the variant forms of gene sequences that can arise as a result of 
nucleotide base pair differences, alternative mRNA splicing or post- 
translational modifications, including, for example, glycosylation. Thus, a 
polymorphism refers to the occurrence of two or more genetically 
determined alternative sequences or alleles in a population. These 
differences can occur in coding and non-coding portions of the genome, 
and can be manifested or detected as differences in nucleic acid 
sequences, gene expression, including, for example transcription, 
processing, translation, transport, protein processing, trafficking, DNA 
synthesis, expressed proteins, other gene products or products of 
biochemical pathways or in post-translational modifications and any other 
differences manifested among members of a population. A single 
nucleotide polymorphism (SNP) refers to a polymorphism that arises as 
the result of a single base change, such as an insertion, deletion or 
change in a base. 

A polymorphic marker or site is the locus at which divergence 
occurs. Such a site may be as small as one base pair (an SNP). 
Polymorphic markers include, but are not limited to, restriction fragment 
length polymorphisms, variable number of tandem repeats (VNTRs), 
hypervariable regions, minisatellites, dinucleotide repeats, trinucleotide 
repeats, tetranucleotide repeats and other repeating patterns, simple 
sequence repeats and insertional elements, such as Alu. Polymorphic 
forms also are manifested as different Mendelian alleles for a gene. 
Polymorphisms may be observed by differences in proteins, protein 
modifications, RNA expression modification, DNA and RNA methylation, 
regulatory factors that alter gene expression and DNA replication, and any 
other manifestation of alterations in genomic nucleic acid or organelle 
nucleic acids. 



As used herein, structural variant proteins refer to the variety of 3-D 
molecular structures or models thereof that result from the 
polymorphisms. These variants typically arise from transcription and 
translation of genes containing genetic polymorphisms, but also include 
differentially glycosylated or otherwise post-translationally modified variants 
that potentially exhibit differential interactions with drugs and drug 
candidates. 

As used herein, binding interactions refer to atomic or physical 
interactions between molecules including, but not limited to binding free 
energy, hydrophobic interactions, electrostatic interactions, steric 
interactions and other interactions that are commonly considered by those 
of skill in the art to determine the affinity of one molecule to bind to 
another. Favorable binding interactions refer to binding interactions that 
promote physical or chemical associations between molecules. 

As used herein, a target protein is defined as a protein that is a 
receptor with which drugs or other ligands, such as small molecule or 
peptide agonists or antagonists or other proteins or biomacromolecules, 
such as DNA or RNA, interact to bring about a biological response. 

As used herein, structure-based drug design refers to computer- 
based methods in which 3-D coordinates for molecular structures are 
used to identify potential drugs that can interact with a biological 
receptor. Examples of such methods include, but are not limited to, 
searching of small molecule libraries or databases, conformational 
searching of a ligand within an active site to identify biologically active 
conformations, or computational docking methods. 

As used herein, pharmacogenomics refers to the study of the variability 
of patient responses to drugs due to inherent genetic differences. 

As used herein, computational docking refers to techniques 
wherein molecules, for example, a ligand and receptor or active site, are 
fitted together based on complementary interactions, for example, steric, 
hydrophobic or electrostatic interactions. 

As used herein, energetic refinement refers to the use of molecular 
mechanics simulation techniques, such as energy minimization or 
molecular dynamics, or other techniques, such as quantum-based 
approaches, to "adjust" the coordinates of a molecular structural model to 
bring it into a stable, low energy, conformation. In molecular mechanics 
simulations, the potential energy of a molecular system is represented as 
a function of its atomic coordinates along with a set of atomic 
parameters, called a forcefield. Energy minimization refers to a method 
wherein the coordinates of a molecular conformation are adjusted 
according to a target function to result in a lower energy conformation. 
Molecular dynamics refers to methods for simulating molecular motion by 
inputting kinetic energy into the molecular system corresponding to a 
specified temperature, and integrating the classical equations of motion 
for the molecular system. During a molecular dynamics simulation, a 
system undergoes conformational changes so that different parts of its 
accessible phase space are explored. 
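
The idea of energy minimization can be illustrated with a deliberately small example. The sketch below applies steepest descent with a backtracking step to a single interatomic distance governed by a Lennard-Jones potential; the choice of potential, the reduced units, the step size and the function names are all assumptions made for illustration, and a practical refinement would act on the full set of atomic coordinates under a complete forcefield.

```python
def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones potential energy for two atoms separated by distance r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_gradient(r, epsilon=1.0, sigma=1.0):
    """Derivative of the Lennard-Jones energy with respect to r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (-12.0 * sr6 * sr6 + 6.0 * sr6) / r


def minimize_distance(r_start, step=0.01, tolerance=1e-6, max_iterations=10000):
    """Steepest descent: move against the gradient until it is nearly zero."""
    r = r_start
    for _ in range(max_iterations):
        gradient = lj_gradient(r)
        if abs(gradient) < tolerance:
            break
        scale = step
        trial = r - scale * gradient
        # Shrink the step if it would raise the energy or cross r = 0.
        while trial <= 0.0 or lj_energy(trial) > lj_energy(r):
            scale *= 0.5
            trial = r - scale * gradient
        r = trial
    return r


r_min = minimize_distance(1.5)
print(f"minimized distance: {r_min:.4f}")
print(f"energy at minimum:  {lj_energy(r_min):.4f}")
```

For this potential the minimum lies analytically at r = 2^(1/6) sigma with energy -epsilon, which gives a convenient check on the numerical result.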

As used herein, clinical data refers to information obtained from 
patients pertaining to pharmacological responses of the patient to a given 
drug, including, but not limited to efficacy data, side effects, resistance or 
susceptibility to drug therapy, pharmacokinetics or clinical trial results. 

As used herein, patient histories include medical histories and 
any other information, such as parental medical histories, dates and 
places of birth of the patient and parents, number of siblings, number of 
children and other such data. 

As used herein, compensatory mutations are mutations that act in 
concert with active site mutations by compensating for functional deficits 
caused by changes or mutations that affect binding in the active site. 

As used herein, a relational database is a collection of data items 
organized as a set of formally-described tables from which data can be 
accessed or reassembled in many different ways without having to 
reorganize the database tables. Such databases are readily available 
commercially, for example, from Oracle, IBM, Microsoft, Sybase, 
Computer Associates, SAP, or multiple other vendors. 

As used herein, a phenotype refers to a set of parameters that 
includes any distinguishable trait of an organism. A phenotype can be a 
physical trait and, in instances in which the subject is an animal, can be 
a mental trait, such as an emotional trait. Some phenotypes can be 
determined by observation, elicited by questionnaires, or obtained by 
referring to prior medical and other records. For purposes herein, a 
phenotype is a parameter around which the database can be sorted. 

As used herein, genotype refers to a specific gene or totality of 
genetic information in a specific cell or organism. 

As used herein, haplotype refers to two or more 
polymorphisms located on a single DNA strand. Hence, haplotyping refers 
to identification of two or more polymorphisms on a single DNA strand. 
Haplotypes can be indicative of a phenotype. 

As used herein, a parameter is any input data that will serve as a 
basis for sorting the database. These parameters will include phenotypic 
traits, medical histories, family histories and any other such information 
elicited from a subject or observed about the subject. A parameter may 
describe the subject, some historical or current environmental or social 
influence experienced by the subject, or a condition or environmental 
influence on someone related to the subject. Parameters include, but are 
not limited to, any of those described herein and those known to those of 
skill in the art. 

As used herein, computational phenotyping refers to computer-based 
processes that assess the phenotype resulting from a particular 
genotype. The phenotype describes observables, such as, but not 
limited to, the structure of the encoded protein and its functional, 
morphological and structural attributes. In particular, as contemplated 
herein, the phenotype that is assessed is the interaction of a protein with a 
particular compound, particularly a drug. As exemplified herein, the 
method provides a means to select an effective drug for a particular 
subject, particularly a mammal, or class of subjects. 

As used herein, a database refers to a collection of data; in this 
case data relating to polymorphic variants. Hence a database contains 
the nucleic acid sequences encoding the variants, or a portion of the 
variant, such as a portion containing the active site or targeted site. 
Additionally, the database may contain other information related to each 
entry, including, but not limited to, the corresponding 3-D structure of 
the encoded protein (or a portion thereof) and information regarding the 
source of each sequence. Some of the entries in a database may be 
identical, and for purposes herein, a database contains at least 2 
different entries, typically far more than 2 entries. The number of entries 
depends upon the protein of interest and variety and number of 
polymorphisms that exist. Generally a database will have at least 10 
different entries, typically more than 100, more than 500, more than 
1000, more than 2000, 3000, 4000, 5000, 8000, 10,000, 50,000, 
100,000 and greater. Databases herein containing 20,000 entries and 
more have been generated and are exemplified herein. 

As used herein, a relational database stores information in a form 
representative of matrices, such as two-dimensional tables, including 
rows and columns of data, or higher dimensional matrices. For example, 
in one embodiment, the relational database has separate tables each with 
a parameter. The tables are linked with a record number, which also acts 
as an index. The database can be searched or sorted by using data in the 
tables and is stored in any suitable storage medium, such as a floppy disk, 
CD-ROM, hard drive or other suitable medium. 
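
For illustration, the arrangement described above, with separate tables linked by a record number, can be sketched using the SQLite engine from the Python standard library. The table names, column names and every value inserted below are hypothetical placeholders chosen for this sketch; they do not reproduce the schema or the contents of the databases provided herein.

```python
import sqlite3


def build_example_database():
    """Create an in-memory database whose tables share a record number."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE variant (
            record_id INTEGER PRIMARY KEY,
            mutations TEXT NOT NULL
        );
        CREATE TABLE coordinates (
            record_id INTEGER NOT NULL REFERENCES variant(record_id),
            atom_name TEXT NOT NULL,
            x REAL, y REAL, z REAL
        );
        CREATE TABLE clinical (
            record_id INTEGER NOT NULL REFERENCES variant(record_id),
            drug TEXT NOT NULL,
            response TEXT NOT NULL
        );
        """
    )
    # Placeholder rows only; none of these values are real data.
    conn.executemany(
        "INSERT INTO variant VALUES (?, ?)",
        [(1, "none"), (2, "mutation_X"), (3, "mutation_Y")],
    )
    conn.executemany(
        "INSERT INTO coordinates VALUES (?, ?, ?, ?, ?)",
        [(1, "CA", 0.0, 0.0, 0.0), (2, "CA", 0.1, 0.0, 0.0), (3, "CA", 0.0, 0.2, 0.0)],
    )
    conn.executemany(
        "INSERT INTO clinical VALUES (?, ?, ?)",
        [(1, "drug_A", "sensitive"), (2, "drug_A", "resistant"), (3, "drug_A", "sensitive")],
    )
    conn.commit()
    return conn


def variants_with_response(conn, drug, response):
    """Join the variant and clinical tables on the shared record number."""
    rows = conn.execute(
        """
        SELECT v.record_id, v.mutations
        FROM variant AS v
        JOIN clinical AS c ON c.record_id = v.record_id
        WHERE c.drug = ? AND c.response = ?
        ORDER BY v.record_id
        """,
        (drug, response),
    )
    return rows.fetchall()


conn = build_example_database()
resistant = variants_with_response(conn, "drug_A", "resistant")
sensitive = variants_with_response(conn, "drug_A", "sensitive")
print(resistant)
print(sensitive)
```

Because each table carries the record number, the same join pattern can retrieve the stored coordinates for any variant selected by its clinical annotation, without reorganizing the tables.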

As used herein, a profile refers to information relating to, but not 
limited to and not necessarily including all of, age, sex, ethnicity, disease 
history, family history, phenotypic characteristics, such as height and 
weight and other relevant parameters. 

As used herein, a biopolymer includes, but is not limited to, nucleic 
acids, proteins, polysaccharides, lipids and other macromolecules. Nucleic 
acids include DNA, RNA, and fragments thereof. Nucleic acids may be 
derived from genomic DNA, RNA, mitochondrial nucleic acid, chloroplast 
nucleic acid and other organelles with separate genetic material. 

As used herein, a DNA or nucleic acid homolog refers to a nucleic 
acid that includes a preselected conserved nucleotide sequence. By the 
term "substantially homologous" is meant having at least 80%, preferably 
at least 90%, most preferably at least 95% homology therewith, or a lesser 
percentage of homology or identity together with conserved biological 
activity or function. 
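
A percentage threshold of this kind can be checked with a short calculation. The sketch below computes percent identity over two pre-aligned sequences of equal length, ignoring columns in which both sequences carry a gap, and compares the result with the 80% figure given above. Treating percent identity as the measure of homology, the gap convention, the function names and the example sequences are all assumptions made for illustration; other definitions of percent identity are in common use.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of compared alignment columns holding identical residues.

    Columns where both sequences have a gap ('-') are skipped; a gap
    opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def is_substantially_homologous(aligned_a, aligned_b, threshold=80.0):
    """Apply a percentage threshold such as the 80% figure given above."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Made-up aligned nucleotide sequences, for illustration only.
reference = "ACGTACGTAC"
candidate = "ACGTACGTTT"

identity = percent_identity(reference, candidate)
print(identity, is_substantially_homologous(reference, candidate))
```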

As used herein, a receptor refers to a molecule that has an affinity 
for a given ligand. Receptors may be naturally-occurring or synthetic 
molecules. Receptors may also be referred to in the art as anti-ligands. 
As used herein, the terms, receptor and anti-ligand are interchangeable. 
Receptors can be used in their unaltered state or as aggregates with other 
species. Receptors may be attached to, covalently or noncovalently, or in 
physical contact with, a binding member, either directly or indirectly via 
a specific binding substance or linker. Examples of receptors include, but 
are not limited to: antibodies, cell membrane receptors, surface receptors 
and internalizing receptors, monoclonal antibodies and antisera reactive 
with specific antigenic determinants (such as on viruses, cells, or other 
materials), drugs, polynucleotides, nucleic acids, peptides, cofactors, 
lectins, sugars, polysaccharides, cells, cellular membranes, and 
organelles. 

Examples of receptors and applications using such receptors, 
include but are not restricted to: 

a) enzymes: specific transport proteins or enzymes essential to 
survival of microorganisms, which could serve as targets for antibiotic 
(ligand) selection; 

b) antibodies: identification of a ligand-binding site on the antibody 
molecule that combines with the epitope of an antigen of interest may be 
investigated; determination of a sequence that mimics an antigenic 
epitope may lead to the development of vaccines of which the 
immunogen is based on one or more of such sequences or lead to the 
development of related diagnostic agents or compounds useful in 
therapeutic treatments such as for auto-immune diseases; 

c) nucleic acids: identification of ligand, such as protein or RNA, 
binding sites; 

d) catalytic polypeptides: polymers, preferably polypeptides, that 
are capable of promoting a chemical reaction involving the conversion of 
one or more reactants to one or more products; such polypeptides 
generally include a binding site specific for at least one reactant or 
reaction intermediate and an active functionality proximate to the binding 
site, in which the functionality is capable of chemically modifying the 
bound reactant (see, e.g., U.S. Patent No. 5,215,899); 

e) hormone receptors: determination of the ligands that bind with 
high affinity to a receptor is useful in the development of hormone 
replacement therapies; for example, identification of ligands that bind to 
such receptors may lead to the development of drugs to control blood 
pressure; and 

f) opiate receptors: determination of ligands that bind to the opiate 
receptors in the brain is useful in the development of less-addictive 
replacements for morphine and related drugs. 

As used herein, prion refers to an infectious pathogen that causes 
central nervous system spongiform encephalopathies in humans and 
animals. No nucleic acid component is necessary for the infectivity of 
prion protein (see, e.g., U.S. Patent No. 5,808,969). 

As used herein, a ligand is a molecule that is specifically recognized 
by a particular receptor. Examples of ligands include, but are not limited 
to, agonists and antagonists for cell membrane receptors, toxins and 
venoms, viral epitopes, hormones (e.g., steroids), hormone receptors, 
opiates, peptides, enzymes, enzyme substrates, cofactors, drugs, lectins, 
sugars, oligonucleotides, nucleic acids, oligosaccharides, proteins, and 
monoclonal antibodies. 

As used herein, complementary refers to the topological 
compatibility or matching together of interacting surfaces of a ligand 
molecule and its receptor. Thus, the receptor and its ligand can be 
described as complementary, and furthermore, the contact surface 
characteristics are complementary to each other. 

As used herein, a ligand-receptor pair or complex is formed when two 
macromolecules have combined through molecular recognition to form a 
complex. 

The terms "homology" and "identity" are often used 
interchangeably. In this regard, percent homology or identity may be 
determined, for example, by comparing sequence information using a GAP 
computer program. The GAP program utilizes the alignment method of 
Needleman and Wunsch (J. Mol. Biol. 48:443 (1970)), as revised by Smith 
and Waterman (Adv. Appl. Math. 2:482 (1981)). Briefly, the GAP program 
defines similarity as the number of aligned symbols (i.e., nucleotides or 
amino acids) which are similar, divided by the total number of symbols in 
the shorter of the two sequences. The preferred default parameters for 
the GAP program may include: (1) a unary comparison matrix (containing 
a value of 1 for identities and 0 for non-identities) and the weighted 
comparison matrix of Gribskov and Burgess, Nucl. Acids Res. 14:6745 
(1986), as described by Schwartz and Dayhoff, eds., ATLAS OF PROTEIN 
SEQUENCE AND STRUCTURE, National Biomedical Research Foundation, 
pp. 353-358 (1979); (2) a penalty of 3.0 for each gap and an additional 
0.10 penalty for each symbol in each gap; and (3) no penalty for 
end gaps. 
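
The similarity measure described above can be sketched, for illustration only, in a few lines of Python. The sketch assumes that the two sequences have already been aligned (with "-" marking gap positions) and applies only the unary comparison matrix (1 for identities, 0 for non-identities); it does not reproduce the GAP program itself, and the function name is hypothetical.

```python
def unary_similarity(aligned_a, aligned_b, gap="-"):
    """Identical aligned symbols divided by the length of the shorter sequence.

    Both inputs are rows of one pairwise alignment, so they must have the
    same length; gap characters are not counted as symbols.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Unary matrix: score 1 when the two aligned symbols are identical.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    shorter = min(len(aligned_a.replace(gap, "")), len(aligned_b.replace(gap, "")))
    if shorter == 0:
        raise ValueError("sequences must contain at least one symbol")
    return matches / shorter
```

For example, aligning ACGT-A against ACCTTA gives four identical positions over a shorter sequence of five symbols, i.e., a similarity of 0.8.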

Whether any two nucleic acid molecules have nucleotide sequences 
that are at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% 
"identical" can be determined using known computer algorithms such as 
the "FASTA" program, using, for example, the default parameters as in 
Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85:2444 (1988). 
Alternatively, the BLAST function of the National Center for Biotechnology 
Information database may be used to determine identity. 

In general, sequences are aligned so that the highest order match 
is obtained. "Identity" per se has an art-recognized meaning and can be 
calculated using published techniques. (See, e.g.: Computational 
Molecular Biology, Lesk, A.M., ed., Oxford University Press, New York, 
1988; Biocomputing: Informatics and Genome Projects, Smith, D.W., ed., 
Academic Press, New York, 1993; Computer Analysis of Sequence Data, 
Part I, Griffin, A.M., and Griffin, H.G., eds., Humana Press, New Jersey, 
1994; Sequence Analysis in Molecular Biology, von Heinje, G., Academic 
Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, 
J., eds., M Stockton Press, New York, 1991). While there exist a number 
of methods to measure identity between two polynucleotide or 
polypeptide sequences, the term "identity" is well known to skilled 
artisans (Carillo, H. & Lipton, D., SIAM J Applied Math 48:1073 (1988)). 
Methods commonly employed to determine identity or similarity between 
two sequences include, but are not limited to, those disclosed in Guide to 
Huge Computers, Martin J. Bishop, ed., Academic Press, San Diego, 
1994, and Carillo, H. & Lipton, D., SIAM J Applied Math 48:1073 
(1988). Methods to determine identity and similarity are codified in 
computer programs. Preferred computer program methods to determine 
identity and similarity between two sequences include, but are not limited 
to, GCG program package (Devereux, J., et al., Nucleic Acids Research 
12(1):387 (1984)), BLASTP, BLASTN, FASTA (Atschul, S.F., et al., J 
Molec Biol 215:403 (1990)). 

Therefore, as used herein, the term "identity" represents a 
comparison between a test and a reference polypeptide or polynucleotide. 
For example, a test polypeptide may be defined as any polypeptide that 
is 90% or more identical to a reference polypeptide. 

As used herein, the term "at least 90% identical to" refers to 
percent identities from 90 to 99.99 relative to a reference polypeptide. 
Identity at a level of 90% or more is indicative of the fact that, assuming 
for exemplification purposes that a test and a reference polypeptide, each 
100 amino acids in length, are compared, no more than 10% (i.e., 10 out of 
100) of the amino acids in the test polypeptide differ from those of the 
reference polypeptide. Similar comparisons may be made between test and 
reference polynucleotides. Such differences may be represented as point 
mutations randomly distributed over the entire length of an amino acid 
sequence or they may be clustered in one or more locations of varying 
length up to the maximum allowable, e.g., 10/100 amino acid differences 
(approximately 90% identity). Differences are defined as nucleic acid or 
amino acid substitutions or deletions. 
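
The 90% example above can be checked with a short sketch. It is offered for illustration only and assumes that the test and reference sequences are given at equal length, position by position, with a deletion in the test sequence written as "-"; the function names are hypothetical.

```python
def percent_identity(test, reference):
    """Percentage of positions at which the test matches the reference."""
    if not reference or len(test) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    # A substitution or a deletion ("-") both count as one difference.
    same = sum(1 for t, r in zip(test, reference) if t == r)
    return 100.0 * same / len(reference)

def is_at_least_identical(test, reference, threshold=90.0):
    """True when the test sequence meets the stated identity threshold."""
    return percent_identity(test, reference) >= threshold
```

With a reference of 100 residues, ten differences give exactly 90% identity and satisfy the threshold, whereas eleven differences do not, whether the differences are scattered or clustered.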

As used herein, AMBER is a force field well known in the art and 
designed for the study of proteins and nucleic acids as defined in Weiner 
et al. J. Comput. Chem. (1986) 7:230-252, where a modified AMBER 
(version 3.3) force field is a fully vectorized version of AMBER (version 
3.0) with coordinate coupling, intra/inter decomposition, and the option to 
include the polarization energy as part of the total energy. AMBER is 
available in commercially available molecular modeling programs such as, 
but not limited to, Macromodel (Columbia University). 

As used herein, ECEPP (Empirical Conformational Energies of 
Peptides Program) is a force field well known in the art (U.S. Patent Nos. 
5,910,478 and 5,846,763). ECEPP/3 refers to version 3 of this well known 
force field. 

As used herein, QSAR refers to quantitative structure-activity relationship. 

As used herein, vdw refers to van der Waals. 

As used herein, RMSD refers to root-mean-square deviation. 

As used herein, medical history refers to the parameters and data 
typically obtained by a physician when examining a subject or other such 
professional when examining other mammals, and includes such 
information as prior diseases, age, weight, height, sex and other 
information. For purposes herein, the subjects that serve as the source of the 
samples from which nucleic acids encoding polymorphisms are isolated, 
include animals, plants, pathogens and any organism that has nucleic acid 
that exhibits polymorphism. In this context medical history refers to 
information pertinent to the particular organism. 

As used herein, subject history, refers to data such as locale in 
which the subject was born, raised or resident or visited, and parental 
history and other such information. 

As used herein, a drug is an agent that binds to or interacts with a 
targeted protein. For purposes herein, a therapeutic agent is a drug. 

B. Computer-based methods of drug design based on genetic 
polymorphisms 

Methods for computer-based drug design based on genetic 
polymorphisms are provided. The methods include the steps of obtaining 
one or more, preferably two or more, amino acid sequences of a target 
protein that is the product of a gene exhibiting genetic polymorphisms; 
generating 3-dimensional (3-D) protein structural variant models of all or a 
portion of the protein from the sequences; and based upon the structures 
of the 3-D models, designing drug candidates or modifying existing drugs 
based on the predicted intermolecular interactions of the drug candidates 
or modified drugs with the structural variants or portions thereof by 
computationally docking drug molecules with the target protein models; 
and then, optionally, energetically refining the docked complexes; 
determining the binding interactions between the drug or potential new 
drug candidate molecules and the models by calculating the free energy 
of binding of the docked complexes and decomposing the total free 
energy of binding based on interacting residues in the protein active site 
or sites deemed important for protein activity. 

A variety of methods that include these steps are provided. Such 
methods have particular application, for example, in predicting patient 
responses. As noted, patients exhibit variable responses to drugs. For 
some patients a drug may be very beneficial and achieve a desired 
response; whereas for other patients, with the same disorder, the same 
drug will have little or no effect. It is known that individuals as well as 
groups of individuals exhibit a variety of genetic polymorphisms. As 
described herein, the presence or absence of such polymorphisms can be 
correlated with the variability of patient responses to drugs. 

It is shown herein that by understanding how genetic 
polymorphisms affect 3-D protein structure of a drug target, for example, it is 
possible to ascertain the interaction of a particular drug with the target in 
a particular patient or groups of patients. Based upon this interaction, the 
outcome can be predicted. It will be possible to determine whether a 
patient will benefit from a drug or be at risk for a particular side effect. It 
is possible to predict these responses before exposure to the drug. These 
methods also permit rational design of drugs that can treat various 
populations or ultimately even individuals. These differences and effects 
can also be taken into account to design drugs that are not dependent 
upon a particular polymorphism. 

Hence, the knowledge derived from understanding the effects of 
genetic polymorphisms can be used to develop and apply therapeutics 
more effectively and to make clinical trials more successful, for example, by 
permitting selection of test subjects with the same polymorphism or with 
polymorphisms for which the drug is designed to interact effectively. 

It is shown herein that it is advantageous to use 3-D molecular 
structures in drug design rather than to consider primary sequence alone. 
For example, most drugs target proteins either in the afflicted organism or 
in a pathogen. Disease, drug action and toxicity are all manifested at the 
protein level. Although the nucleotide sequences of genetic 
polymorphisms might appear to be quite different, the resulting protein 
targets may have similar shapes and, therefore, the protein biological 
function might be the same. Conversely, although genetic polymorphism 
sequences might appear similar, the resulting proteins may have critical 
differences in their 3-D structures that greatly affect biological activity. 
Thus, use of 3-D protein structure models in such methods provides 
advantages not heretofore realized. Methods for generating 3-D structures 
are known to those of skill in the art and are also provided herein. 

Once the protein target structural models have been selected, 
structure-based drug discovery methodologies, for example, 
computational screening or docking programs and methods (e.g., DOCK, 
available from the University of California, San Francisco; and AUTODOCK, 
available from Scripps Research Institute, La Jolla), are used to design 
biologically-active compounds based on the 3-D structures of the biomolecular 
receptors. Using these methods, drug designers can identify and 
computationally rank the various potential clinical drug candidates for 
maximum efficacy, thereby performing drug discovery in silico and 
avoiding the tedious time and expense associated with in vitro drug 
discovery methods. 

In addition to drug design applications, the information derived from 
studying the structures of biological targets can be used to understand 
and predict biological responses in patients, such as efficacy, toxicity, 
drug resistance and other pharmacological effects. Since human clinical 
trials may cost upwards of $100-300 million, it is desirable to predict the 
outcome to the greatest extent possible for each prospective drug 
candidate so that the best prospective drug candidates are advanced to 
clinical trials. As described below, methods are provided herein for 
selecting populations for clinical trials. 

1. Methods for obtaining amino acid sequences of a target 
protein 

Any protein, gene or encoded mRNA that exhibits 
polymorphisms in structure, herein referred to as the target protein, is 
contemplated for use herein and for generating the databases as provided 
herein. The target protein is a protein, polypeptide, or oligopeptide that 
includes, but is not limited to, receptors, enzymes, hormones, prions, or 
any such compound with which drugs or other ligands, such as small 
molecules, peptide agonists, peptide antagonists, other proteins, nucleic 
acids and other biomacromolecules, interact to bring about a biological 
response. These target proteins occur in any organism, including plants 
and animals, eukaryotes and prokaryotes, including pathogens, such as 
protozoans, parasites, viruses, including DNA and retroviruses, and 
bacteria. The protein or gene can be one expressed in the organism, such 
as a molecule targeted for drug interaction, or one expressed in a 
pathogen. 

The target gene is one that exhibits polymorphisms (i.e., sequence 
variations among a population) and the target protein is the product of a 
gene exhibiting genetic polymorphisms, or sequence variations, as 
described herein. Any gene or protein that exhibits polymorphisms is 
contemplated herein. In particular, genes that encode proteins, 
polypeptides, or oligopeptides that are targets for drug interaction are 
contemplated herein. The genetic polymorphisms can occur in the genes 
of pathogens (e.g., viruses, bacteria, and fungi), parasites, plants, 
animals, and humans. As such, the sequence of a target protein can be 
obtained by the isolation and analysis of the gene or gene product in 
samples taken from pathogens, parasites, plants, animals, and humans, 
most preferably from humans. 

The genes or proteins may be isolated from any source, such as 
animal or plant specimens, or the sequences obtained from any source, 
including known databases. If starting with gene sequences that include 
single or multiple nucleotide polymorphisms, the amino acid sequences of 
the translated proteins can be determined. Protein isolation and 
sequencing methods are well known to those of skill in the art. 
Alternatively, samples of the target protein can be obtained and 
sequenced directly from specimens. Multiple sequence analyses can be 
performed to determine the exact amino acid variations or mutations 
resulting from the genetic polymorphisms. 
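
As a minimal sketch of the step just described, the following Python fragment translates a reference coding sequence and a polymorphic variant and reports the resulting amino acid variations. It assumes the standard genetic code, in-frame coding sequences of equal length, and substitutions only; the function names and the short example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate an in-frame coding sequence; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def amino_acid_variations(reference_dna, variant_dna):
    """List (position, reference residue, variant residue), 1-based."""
    ref, var = translate(reference_dna), translate(variant_dna)
    if len(ref) != len(var):
        raise ValueError("this sketch handles substitutions only")
    return [(pos, r, v)
            for pos, (r, v) in enumerate(zip(ref, var), start=1) if r != v]
```

A single nucleotide change may alter the encoded residue (AAA to AGA changes lysine to arginine) or leave it unchanged (AAA to AAG), which is why the variations are determined at the protein level.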

Amino acid sequences of target proteins can also be obtained from 
data banks and databases (e.g., GenBank, Swiss Prot, PIR) and from 
publications and other sources in which numerous polymorphisms have 
been identified and mapped. Samples may be obtained from, for example, 
blood and tissue banks; nucleic acid can be isolated, genes selected or 
identified, and polymorphisms mapped from such samples. 

2. Generation of 3-D protein structural variant models 

After the amino acid sequences of target proteins are obtained via 
the means described in section 1, the 3-D structural models of the 
sequences of native proteins or of the protein structural variants are then 
determined. They can be determined through experimental methods, such 
as x-ray crystallography and NMR, and from structure databases, such as 
the Protein Databank (PDB). Moreover, 3-D structural models can be 
determined by using any of a number of well known techniques for 
predicting protein structures from primary sequences (e.g., SYBYL (Tripos 
Associates, St. Louis, MO)), de novo protein structure design programs 
(e.g., MODELER (MSI, Inc., San Diego, CA) and MOE (Chemical 
Computing Group, Montreal, Canada)), ab initio methods (see, e.g., 
U.S. Patent Nos. 5,331,573, 5,579,250 and 5,612,895), homology 
modeling, and ab initio computational analysis. Homology modeling, 
structure determination based upon x-ray crystallographic structures, and 
ab initio techniques and combinations of these methods are among those 
preferred herein. 

a. Homology Modeling 

Homology modeling is based on the relationship between protein 
evolutionary origin, function and folding patterns. Proteins of related 
origin and function have conserved sequences and structural features 
among the members of a homologous family. Using these relationships, a 
three-dimensional structural model for a protein of unknown structure can 
be constructed by using composite parts of related proteins in the same 
family. Where only the primary amino acid sequence of a target protein is 
known, the sequence can be compared to the sequences of related 
proteins with known structures (reference proteins), and a model can be 
built by incorporating the structural attributes of the reference protein 
together with the sequence of the target protein. 

Sequence homology calculations generally require: the amino acid 
sequence of the target protein; a high resolution structure for at least one, 
but preferably more, related reference proteins; and any other related 
amino acid sequences. The reference proteins include structures which 
are similar to the target protein, either by sequence, fold, function, or 
which are polymorphisms of the target protein. The more related protein 
structures and sequences that are available or determined, the more 
reliable the technique will be at providing an accurate model. 

In constructing a protein model using homology modeling, 
sequence alignment is performed between the target sequence and any 
known structures within the protein family. Sequence alignment requires 
determining the similarity between protein sequences by maximizing the 
number of matches between the sequences while introducing the 
minimum number of insertions and deletions. Sequence alignment algorithms 
are well known in the art, and standard gap penalties (i.e., programs that 
automatically introduce gaps to maximize alignment and then adjust the 
percentage of identity by applying penalties for gap number and gap 
length) and other parameters can be selected by the skilled artisan. 
Additionally, the 3-D structures of the known reference proteins, 
preferably, are aligned to give the best overall fit for the proteins in the 
family. This provides indication of structurally-conserved regions, such as 
regions of the proteins that do not contain insertions or deletions, among 
the reference structures. 
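
The trade-off between maximizing matches and minimizing insertions and deletions can be illustrated with a minimal global alignment sketch in the style of Needleman and Wunsch. The scoring values (match +1, mismatch -1, linear gap penalty -2) are assumptions chosen for the example and are not the parameters of any particular program; production tools typically use substitution matrices and separate gap-opening and gap-extension penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for a global alignment."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Aligning ACGT with AGT under these assumed scores introduces a single gap opposite the C, giving three matches and one gap penalty.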

Once the sequences are aligned and the structurally-conserved 
regions are identified, the coordinates of the reference proteins can be used 
to construct a 3-D model of the target structure. Coordinates from the 
protein backbone of the reference proteins are then used to construct the 
backbone framework for the target protein structure. Side chains can be 
constructed, for example, by using side chain coordinates from the 
reference proteins, searching from a database to obtain side chain 
conformations that fit in with the existing structural framework or by 
generating side chains ab initio to establish energetically favorable side 
chain conformations. 

The non-conserved regions of the unknown protein can be 
constructed, for example, using database searching. A database of known 
protein structures (e.g., PDB) can be searched to identify variable regions 
in other proteins that have a high degree of sequence similarity to the 
target sequence and that fit onto the existing structural framework of the 
protein model. Algorithms for performing sequence similarity matching 
and homology model building are well known in the art and are available 
commercially (available from Molecular Simulations, Inc., Tripos, Inc. and 
from numerous academic sources). 

The variable regions can also be modeled by fitting the target 
sequence to a peptide backbone generated by varying phi and psi angles 
(e.g., by calculating Ramachandran or Balasubramanian plots (see 
Balasubramanian (1974) "New type of representation for Mapping Chain 
Folding in Protein Molecules," Nature 266:856-857) or Balaji plots (see 
U.S. Patent Nos. 5,331,573, 5,579,250 and 5,612,895)) of the amino 
acids to give a loop structure that can be integrated into the model 
structure based on a sterically and energetically reasonable fit (Figure 1). 

In a Balasubramanian plot, the peptide is depicted as a series of 
different vertical lines, each having solid dots and open circles aligned 
with the corresponding φ, ψ angle values on the vertical axis, and where 
each line corresponds to the particular number of the residue having the 
plotted φ, ψ angles as indicated on a horizontal axis. In the Balaji plot, 
the values of the φ, ψ angles are shown as the base and tip of a vertical 
wedge (assuming a vertical angular axis), respectively, with a separate 
wedge being horizontally positioned on the plot as a function of the 
residue number of the φ, ψ angles plotted. The Balaji plot replaces the 
solid dots and open circles of the Balasubramanian Plot with the base of a 
wedge and the tip of a wedge, respectively; and further replaces the 
vertical line joining the dots and open circles of the Balasubramanian plot 
with the body of the wedge. 
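
The φ and ψ values plotted in such representations are backbone dihedral angles. The following sketch, offered only as an illustration, computes a dihedral angle from four atomic positions; by the usual convention φ is taken over the atoms C(i-1), N(i), Cα(i), C(i) and ψ over N(i), Cα(i), C(i), N(i+1). The helper names are hypothetical and coordinates are assumed to be (x, y, z) tuples.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)   # normals of the two planes
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

Applying the function residue by residue to a backbone yields the per-residue φ, ψ pairs that a Ramachandran, Balasubramanian or Balaji plot displays.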

b. Ab initio generation of 3-D structures 

Alternatively, ab initio methods can be used in combination with an 
existing partial homologous structure to generate unresolved portions of 
the target structure. Such methods are described, for example, in U.S. 
Patent Nos. 5,331,573, 5,579,250 and 5,612,895, which, as with all patents, 
applications and publications referenced herein, are each incorporated by 
reference in their entirety. These methods involve: simulating a real-size primary 
structure of a polypeptide in a solvent box, i.e., an aqueous environment; 
shrinking the size of the peptide isobarically and isothermally; and 
expanding the peptide to its real size in selected time periods, while 
measuring the energy state and coordinates, i.e., the bonds, angles and 
torsions of the expanding molecule. As the peptide expands to its full 
size, it assumes a stable tertiary structure. In most cases, due to the 
manner in which the expansion occurs, this tertiary structure will be 
either the most probable structure (i.e., it will represent a global minimum 
for the structure) or one of the most probable structures. The energy 
equations used to perform the ab initio simulation are based on the 
potential energy of the simulated molecule as described using molecular 
mechanics. 

Once a model is built, it can be refined using energy minimization, 
molecular dynamics calculations, or simulated annealing as described 
herein. The steric and energetic quality of the structural models is then 
evaluated by analyzing the structural attributes of the model, such as phi 
and psi angles (e.g., by calculating Ramachandran or Balasubramanian or 
Balaji plots), or the energetics of the model, such as by calculating energy 
per residue or strain energy. If the overall quality of the model is not 
satisfactory, further iterative energy refinement can be performed until the 
model is considered to be acceptable (i.e., $e_{av} < 1.5$, see below). 

A preferred method for generating and refining the structural 
variant models is illustrated in FIG. 1. First, at block 100 of FIG. 1, 
protein sequence information, derived from genetic polymorphisms, is obtained 
from the methods described earlier. At block 102, the protein is assigned 
to a protein superfamily in order to identify related proteins to be used as 
templates to construct a 3-D model of the protein. If the superfamily is 
not known, sequence analysis or structural similarity searches can be 
performed to identify related proteins for use as templates in homology 
modeling studies, as described herein, as indicated at block 104. 

Once the conserved regions of the model are assembled, ab initio 
loop prediction (Dudek et al. (1998) J. Comp. Chem. 19:548-573) 
indicated at 106A or ab initio secondary structure generation techniques 
of block 106B, techniques in which the alignments are adjusted using 
information on the secondary structure, functional residues, and disulfide 
bonds as described herein, can be used to complete the model (e.g., U.S. 
Patent Nos. 5,331,573; 5,579,250; and 5,612,895). This model, 
complete with loops, is then subjected to refinement procedures (block 
110) based on molecular mechanics, molecular dynamics, and simulated 
annealing methods. Energetic refinement of the structure can be 
accomplished by performing molecular mechanics calculations using, for 
example, an ECEPP type forcefield (Dudek et al. (1998) J. Comp. Chem. 
19:548-573) or through molecular dynamics simulations using, for 
example, a modified AMBER type forcefield (Ramnarayan et al. (1990) J. 
Chem. Phys. 92:7057-7076). As known to those of skill in the art, a 
modified AMBER (version 3.3) force field is a fully vectorized version of 
AMBER (3.0) with coordinate coupling, intra/inter decomposition, and the 
option to include the polarization energy as part of the total energy (see, 
e.g., Weiner et al. (1986) J. Comp. Chem. 7:230-252). If necessary, the 
3-D structures can be dynamically refined, for example, by using a 
simulated annealing protocol (e.g., 100 ps equilibration, 500 ps 
dynamics, up to 1000 K, 1 fs data collection). 

The refinement process step 110 is used to offset problems that 
may arise when homology models are not built carefully or when they are 
built using fully automated methods. Problems that may arise include 
chain breaks (e.g., consecutive Cα atoms are farther apart than the 
optimum distance of 3.7 to 3.9 Å); distorted geometry (e.g., bond lengths 
and bond angles are too far from their optimal values); cis-peptide bonds 
(e.g., incorrect isomerization of the peptide backbone in non-proline 
residues when it is not required); disallowed backbone and side-chain 
conformations (e.g., dihedral angles do not satisfy the Ramachandran plot 
(see, Balasubramanian (1974) Nature 266:856-857) criteria for a fully 
favorable protein structure conformation); and misfolded loops (e.g., 
non-homologous loops are generated in unnatural conformations). The 
refinement procedure 110 removes distortions of covalent geometry by 
using energetic methods, converts disallowed backbone and side-chain 
conformations into allowed ones using simulated annealing methods, 
conserves protein core structure and secondary structural elements built 
by homology, and rebuilds unnatural loop constructions (Dudek et al. 
(1998) J. Comp. Chem. 19:548-573). 
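
One of the problems listed above, the chain break, lends itself to a simple automated check. The sketch below flags consecutive Cα atoms that lie farther apart than a cutoff. Using the upper end of the stated optimum range (3.9 Å) as the default cutoff is an assumption made for illustration, as are the function name and the input format (a list of (x, y, z) coordinates in Å, in residue order).

```python
import math

def find_chain_breaks(ca_coords, max_dist=3.9):
    """Return (i, i + 1, distance) for each consecutive C-alpha pair
    whose separation exceeds max_dist (in the units of the coordinates)."""
    breaks = []
    for i in range(len(ca_coords) - 1):
        d = math.dist(ca_coords[i], ca_coords[i + 1])
        if d > max_dist:
            breaks.append((i, i + 1, d))
    return breaks
```

Any pair reported by such a check would be a candidate for the loop rebuilding and energy refinement described for step 110.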

For quality control (block 112), the protein structural 
characteristics, for example, stereochemistry (e.g., phi/psi and side chain 
angles), energetics (e.g., strain energy), packing profile (e.g., packing 
factor per residue) and hydrophobic packing are evaluated and required to 
meet acceptable criteria before the structures are used in further studies 
or inputted into a structural polymorphism database. Quality control 
using strain energies entails computing normalized residue energies 
(NREs) based on the equation: 

$$e_i = \frac{E(i,X) - E_{AV}(X)}{E_{SD}(X)},$$ where 

$E(i,X)$ is the energy of interactions of amino acid X in position i with 
protein environment and solvent; 

$E_{AV}(X)$ and $E_{SD}(X)$ are the average residue energies and their standard 
deviations calculated for 20 amino acids in more than 100 high-quality 
crystal structures; and 

NREs characterize how favorable the interactions of each residue 
are within the protein environment (Majorov and Abagyan, (1998) Folding 
& Design 3:259). 

The average NRE characterizes the overall quality of a protein structure 
and is defined as: 

$$e_{av} = \frac{1}{N} \sum_{i} e_i,$$ where 

$e_{av} < 0.5$ denotes high-resolution X-ray crystal structures; 
$e_{av} < 1.0$ denotes good NMR and theoretical models; and 
$e_{av} > 1.5$ denotes structures that require further refinement. 
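
The NRE calculation can be sketched as follows, for illustration only. The sketch takes N to be the number of residues in the model; the per-residue energies and the reference averages and standard deviations are supplied by the caller, and the values used in any example are hypothetical placeholders rather than statistics from actual crystal structures. The text does not classify values between 1.0 and 1.5, so the sketch reports them as unclassified.

```python
def normalized_residue_energies(energies, residue_types, ref_mean, ref_sd):
    """e_i = (E(i, X) - E_AV(X)) / E_SD(X) for each residue position i."""
    return [(e - ref_mean[x]) / ref_sd[x]
            for e, x in zip(energies, residue_types)]

def average_nre(nres):
    """e_av = (1 / N) * sum of e_i over the N residues."""
    if not nres:
        raise ValueError("at least one residue is required")
    return sum(nres) / len(nres)

def quality_label(e_av):
    """Map e_av onto the categories stated in the text."""
    if e_av < 0.5:
        return "high-resolution X-ray quality"
    if e_av < 1.0:
        return "good NMR or theoretical model quality"
    if e_av > 1.5:
        return "requires further refinement"
    return "not classified in the text"
```

For instance, two residues with normalized energies of 0.5 and 1.0 give an average of 0.75, which falls in the second category.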
After the quality of structure is determined at block 1 12, the model is 
checked at block 1 14 to determine if it is satisfactory. If the overall 
quality of the model is not satisfactory, a "No" outcome at block 1 16, 
then remedial action is undertaken to fix problems at block 118, including 
further iterative energy refinement (block 1 10), and repeated checking 
(block 114). The refinement and evaluation is repeated until the model is 
considered to be acceptable, a "Yes" outcome at block 120, whereupon 
structural and/or physical properties (e.g. energetics and phi/psi angles) 
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are calculated at block 122A and clinical data (if available) is obtained at 
block 122B. The model is then inputted into a structural polymorphism 
database at block 124. 
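
The refine-and-check loop of FIG. 1 (blocks 110 through 120) can be sketched as follows. This is a minimal sketch, assuming the refinement and evaluation steps are supplied as callables. The names refine, evaluate and is_acceptable, and the max_rounds safeguard, are assumptions introduced for illustration and are not part of the described method.

```python
def refine_until_acceptable(model, refine, evaluate, is_acceptable, max_rounds=10):
    """Repeat refinement and evaluation until the model passes the check.

    Returns the accepted model, its quality measure and the number of
    refinement rounds that were needed.
    """
    quality = evaluate(model)                 # quality control, block 112
    rounds = 0
    while not is_acceptable(quality):         # check, block 114
        if rounds == max_rounds:              # safeguard (assumption)
            raise RuntimeError("model did not reach acceptable quality")
        model = refine(model)                 # remedial action, blocks 118 and 110
        quality = evaluate(model)
        rounds += 1
    return model, quality, rounds

# Toy stand-ins: the "model" is just its average NRE, and each refinement
# round halves it. Real refinement would be an energy minimization or
# molecular dynamics run.
final_model, final_quality, n_rounds = refine_until_acceptable(
    model=2.0,
    refine=lambda m: m * 0.5,
    evaluate=lambda m: m,
    is_acceptable=lambda q: q < 1.0,
)
```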

FIG. 2 shows an exemplary method for generating structural variant 
models derived from genetic polymorphisms and using them in structure- 
based drug design studies. At the block numbered 200, patient data is 
acquired for a gene that exhibits genetic polymorphisms. Protein 
sequence information is then derived, at block 202. A check is made for 
determination of the 3-D structure of the native protein. If the 3-D 
structure has been determined, a "Yes" outcome at block 206, then a 
multiple sequence analysis is performed at block 208 to determine the 
exact amino acid variations for the structure. If the 3-D structure has not 
been determined, a "No" outcome at block 210, then the structure is 
determined using physicochemical methods at block 212. 
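
One simple way to list the amino acid variations of a variant relative to the native sequence is sketched below. It assumes the two sequences are already aligned and of equal length, and it reports substitutions only; insertions and deletions, which a full multiple sequence analysis would handle, are outside this sketch. The function name and the sequences are hypothetical.

```python
def list_substitutions(native, variant):
    """Return substitutions such as 'L4I' between two pre-aligned sequences."""
    if len(native) != len(variant):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        f"{a}{pos}{b}"                      # native residue, position, variant residue
        for pos, (a, b) in enumerate(zip(native, variant), start=1)
        if a != b
    ]

# Hypothetical one-letter amino acid sequences.
native_seq = "MKVLAAG"
variant_seq = "MKVIAAD"
subs = list_substitutions(native_seq, variant_seq)
```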

Next, at block 214, the 3-D structural models for all variants are 
generated. A refinement process is then completed at block 216 for the 
structural models. As noted above in connection with FIG. 1, the process 
involves subjecting each model, complete with loops, to refinement 
procedures based on molecular mechanics, molecular dynamics, and 
simulated annealing methods. As before, the energetic refinement of the 
structure can be accomplished by performing molecular mechanics 
calculations using an ECEPP type forcefield (Dudek et al. (1998) J. Comp. 
Chem. 19:548-573), or through molecular dynamics simulations using, for 
example, a modified AMBER type forcefield (Ramnarayan et al. (1990) J. 
Chem. Phys. 92:7057-7076), where a modified AMBER (version 3.3) 
force field is a fully vectorized version of AMBER (3.0) with coordinate 
coupling, intra/inter decomposition, and the option to include the 
polarization energy as part of the total energy (Weiner et al. (1986), J. 
Comp. Chem. 7:230-252). If necessary, the 3-D structures can be 
dynamically refined, for example, by using a simulated annealing protocol 
(e.g., 100 ps equilibration, 500 ps dynamics, up to 1000 K, 1 fs data 
collection). 

At block 218, a quality evaluation is performed for all the models. 
As described in connection with the quality evaluation process in Fig. 1, 
the evaluation at block 218 involves evaluating the protein structural 
characteristics, for example, stereochemistry (e.g., phi/psi and side chain 
angles), energetics (e.g., strain energy), packing profile (e.g., packing 
factor per residue) and hydrophobic packing, which must meet acceptable 
criteria before the structures are used in further studies or inputted into a 
structural polymorphism database. 

After the model quality is determined, at block 220 the models are 
checked to determine if they are satisfactory for further use. If a model is 
not satisfactory, a "No" outcome at block 222, then the problems are 
identified and solved with remedial action at block 224. The remedial 
action may include further iterative energy refinement at block 216 and 
repeated checks of model quality at block 218. Once the models are 
satisfactory, a "Yes" outcome at block 226, structure-based drug design 
methods are applied at block 228 to identify potential new drugs that 
bind to the structural variant models. The drug design methods are 
described further below. 

FIG. 3 shows another exemplary and alternative method for 
generating structural variant models derived from genetic polymorphisms 
and using them in structure-based drug design studies. The process of 
FIG. 3 is similar to the process of FIG. 2 from the initial process at block 
300 of acquiring patient data for a gene that exhibits genetic 
polymorphisms through the process of obtaining models that are 
satisfactory (a "Yes" outcome at block 326). Thus, block numbers in 
FIG. 3 from 300 through 326 that correspond to FIG. 2 blocks numbered 
from 200 through 226 refer to similar operations. Unlike FIG. 2, 
however, the process illustrated in FIG. 3 then involves docking 
operations. 

At block 328, once the models are determined to be satisfactory, 
drug molecules are docked with the structural variant models. Next, at 
block 330, the free energy of binding is evaluated with the potential drugs 
under study for each structural variant model. At block 332, the total 
free energy of binding is decomposed, based on the interacting residue in 
the protein active site. Lastly, at block 334, the free energy of binding is 
correlated with patient data, if the data is available. Thus, the 3-D 
structural data is employed in drug design. Details of using such 
structural data in drug design are described further below. 
c. Crystal structures 

The crystal structure of any protein can be determined empirically 
and the resulting coordinates used as the basis for determining structures 
of variants. Such structures are often known (see, e.g., Kohlstaedt et al. 
(1992) Science 256:1773-1790 for a crystal structure of HIV-1 RT bound 
to a ligand). 

3. Use of 3-D structural variant models in drug design 

The structural differences in protein structural variants that arise 
due to genetic polymorphisms can have profound effects on biological 
activity. Because of the structural differences among the variants, they 
may have different physical or reactive properties and therefore may 
exhibit different biological activities. These differences may include, for 
example, different responses to a given drug, so that a drug which works 
well in a patient with one particular genetic polymorphism may not work 
as well in another patient exhibiting a different polymorphism. 

The 3-D molecular structures of drug targets derived from genetic 
polymorphisms can be used in structure-based drug design studies to 
greatly advance the development of new pharmaceuticals. Relational 
databases of these 3-D structures that are derived from samplings of 
genetic polymorphisms over a patient population or a cross-section of the 
population can be used to design potential drugs in order to optimize 
effectiveness for the particular population. 

The structures and databases described herein can provide 
information that is useful, for example, in designing a drug that is 
effective in the greatest percentage of the population. It is desirable that 
a given drug is effective in the largest percentage of the population, since 
such a drug is likely to have the greatest clinical utility and thus the 
greatest commercial value. A drug with superior performance properties 
is sometimes referred to as a "best in class" drug and is highly prized by 
pharmaceutical companies since this heralds market leadership and the 
likelihood of commercial success. The databases and methods described 
herein can be used to determine 3-D protein structures for drug targets 
that are associated with particular genetic polymorphisms and to use the 
structures in drug design studies for design and optimization of candidate 
drugs that exhibit activity over the broadest patient population. 

Genetic polymorphisms may result in target protein structural 
variants in which drug efficacy correlates with specific populations or 
subpopulations. In some cases, it might be desirable to target drug 
design or drug therapy toward a specific patient population, such as a 
particular race, gender, or age group, affected by a certain disease or 
condition or toward those having a specific genetic polymorphism. The 
information derived from comparing the 3-D structural variants arising 
from different genetic polymorphisms may be useful for understanding 
why drugs are active or inactive in different subpopulations, or for 
assisting in developing new drugs to maximize efficacy across specific 
populations. 



a. Selection of relevant structural variants 

The structural variant models in the structural polymorphism 
database provided herein can be used to design new drugs or to select a 
drug therapy that would be appropriate for a patient exhibiting a particular 
genetic polymorphism. As it may not be possible for a drug to work 
equally well for all polymorphisms, and thus all patients, representative 
structural variants can be selected for use in drug design studies in order 
to maximize biological activity based on genetic polymorphisms. 

In some cases, structural variants are analyzed to determine the 
common structural features that are conserved through the selected 
models. These conserved features are used as a basis for drug design. 
In some cases, the structural variant corresponding to the genetic 
polymorphism occurring most commonly in a population can be selected 
for use in identifying drugs that would be effective in the greatest 
percentage of the population. Optionally, structural variants 
corresponding to a relevant subpopulation, such as a particular gender, 
age, race, or other characteristic, can be selected for use in designing 
drugs that are active in that subpopulation. In other cases, individual 
structural variant models can be selected for use in designing drugs that 
are specifically active against one target in one individual arising from a 
particular genetic polymorphism. Additionally, model structures that 
represent variants derived from patients that receive a specific treatment 
regimen or exhibit a particular clinical response (e.g. drug resistance) to a 
given drug are used as bases for drug design. 
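
As a sketch of the selection criteria just described, the most common variant in a population, or in a subpopulation defined by patient attributes, could be picked as shown below. The record layout, the attribute names and the variant labels are hypothetical and stand in for whatever patient data are actually available.

```python
from collections import Counter

def most_common_variant(records, subgroup=None):
    """Return the most frequent variant label among matching patient records.

    records:  list of dicts, each with a 'variant' key plus optional attributes
    subgroup: optional dict of attribute filters, e.g. {"gender": "F"}
    """
    subgroup = subgroup or {}
    selected = [
        r["variant"]
        for r in records
        if all(r.get(key) == value for key, value in subgroup.items())
    ]
    if not selected:
        return None
    return Counter(selected).most_common(1)[0][0]

# Hypothetical patient records.
records = [
    {"variant": "V1", "gender": "F"},
    {"variant": "V1", "gender": "M"},
    {"variant": "V2", "gender": "F"},
    {"variant": "V2", "gender": "F"},
    {"variant": "V1", "gender": "M"},
]

overall = most_common_variant(records)
female_only = most_common_variant(records, {"gender": "F"})
```

Filtering on a clinical attribute such as a recorded drug response would follow the same pattern.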

The relevant structural variants may be identified using the 
structural analysis tools described herein, optionally in combination with 
database and statistical analysis tools that permit a complete analysis and 
comparison of the molecular structures and properties of the structural 
variants. The structural variants selected based on the criteria including, 
but not limited to, those listed above are used in drug design. 
b. Drug design 

Once the protein target structural models have been selected, 
structure-based drug discovery methodologies, for example, 
computational screening or docking (e.g., DOCK, available from University 
of California, San Francisco; AUTODOCK, available from Scripps Research 
Institute, La Jolla; and others referenced herein or known to those of skill 
in the art), can then be used to design biologically-active compounds 
based on the 3-D structures of the biomolecular receptors. 

Using these methods, drug designers can identify and 
computationally rank various potential clinical drug candidates for 
maximum efficacy, thus cutting the time and expense associated with 
drug discovery. The preferred design of drug candidates or the 
modification of existing drugs is based on the intermolecular interactions 
between the drug candidate or modified drugs and the selected structural 
variants predicted by computationally docking drug molecules with the 
target protein models; energetically refining the docked complexes; 
determining the binding interactions between the drug or potential new 
drug candidate molecules and the models by calculating the free energy 
of binding of the docked complexes and decomposing the total free 
energy of binding based on interacting residues in the protein active site 
or sites deemed important for protein activity. 

c. Computational docking 

Methods for using the structural variant models to design potential 
new drugs or to aid in the selection of a drug therapy based on the 
interactions of selected small molecules with the particular variants are 
provided. Structure-based drug design experiments, such as 
computational screening or docking studies, calculation of binding 
energies or analysis of steric, electrostatic or hydrophobic properties of 
the resulting structural variant models, can be performed on selected 
structural variant models to aid in the understanding of observed 
biological activities or to determine new potential drug candidates to bind 
to the particular target. 

In a typical computational docking protocol, the active site, or sites 
deemed important for protein activity, of the protein model is defined. A 
molecular database, such as the Available Chemicals Directory (ACD) or 
any database of molecules, is screened for molecules that complement 
the protein model. Solvation parameters are factored in (see, e.g., 
Shoichet et al. (1999) PROTEINS: Structure, Function, and Genetics 34:4- 
16). In these computational docking studies, drugs or drug candidates 
are fitted to the structural variant models based on complementary 
interactions (e.g., steric, hydrophobic, or electrostatic interactions). 
Methods for performing such studies are well known and software tools 
for performing the calculations are widely available (M. Lambert, "Docking 
Conformationally Flexible Molecules into Protein Binding Sites" in Practical 
Application of Computer-Aided Drug Design, Charifson, Ed., Marcel 
Dekker, NY, pp. 243-303; Kuntz (1992) Science 257:1078-1082; Kuntz 
et al. (1982) J. Mol. Biol. 161:269-288; Stewart et al. (1992) Med. 
Chem. Res. 7:439-443; Shoichet et al. (1993) Science 259:1445-1450; 
Shoichet et al. (1991) J. Mol. Biol. 221:327-346). 

New potential drug candidates can be designed by identifying 
potential small molecule drugs that can bind to a particular structural 
variant. This is accomplished, for example, by methods including, but 
not limited to, methods for electronic screening of small molecule 
databases as described herein, methods involving modifying the 
functional groups of existing drugs in silico, and methods of de novo ligand 
design. Methods for computationally designing drugs are known to those 
of skill in the art and include, but are not limited to, DOCK (Kuntz et al. 
(1982) "A Geometric Approach to Macromolecule-Ligand Interactions", J. 
Mol. Biol., 161:269-288; available from University of California, San Francisco); 
AUTODOCK (see, Goodsell et al. (1990) "Automated Docking of 
Substrates to Proteins by Simulated Annealing", Proteins: Structure, 
Function, and Genetics, 8, pp. 195-202; available from Scripps Research 
Institute, La Jolla); GRID (Oxford University, Oxford, UK); CAVEAT (UC 
Berkeley, CA); LEGEND (Molecular Simulations, Inc., San Diego, CA); 
LUDI (Molecular Simulations, Inc., San Diego, CA); HOOK (Molecular 
Simulations, Inc., San Diego, CA); CLIX (CSIRO, Australia); GROW 
(Upjohn Laboratories, Kalamazoo); others including HINT, LUDI, 
NEWLEAD, HOOK, PRO-LIGAND and CONCERTS (see, M. Murcko, "An 
Introduction to De Novo Ligand Design" in Practical Application of 
Computer-Aided Drug Design, Charifson, Ed., Marcel Dekker, NY, pp. 305- 
354); methods based on QSAR (quantitative structure-activity 
relationships; QSAR and Drug Design: New Developments and 
Applications, Fujita, Ed., (1995) Elsevier, pp. 3-81; 3D QSAR in Drug 
Design, Kubinyi, Ed., (1993) Escom, Leiden); and other methods known 
to those of skill in the art for determining molecules that have optimal 
binding interactions with a selected target. 

The docked complexes, if needed, are further refined energetically 
to optimize geometries within the binding site and to select the best 
structure from a set of possible structures, using molecular mechanics, 
molecular dynamics, and simulated annealing techniques, including those 
described herein and others that are known to those skilled in the art. 
d. Free energy of binding studies 

After the computational docking step, the free energy of binding of 
the docked complex is calculated, and the total free energy of binding is 
decomposed based on the interacting residues in the protein active site or 
sites deemed important for protein activity. Analyses of the binding 
energies are needed to identify drug candidates. If needed or desired, the 
free energy of binding of different drugs or potential drugs to each 
structural variant model can be calculated by subtracting the free energy 
of the non-interacting protein and drug from the free energy of the 
protein-drug complex. The total free energy of binding is decomposed 
into its various thermodynamic components, e.g., enthalpic and entropic 
components, based on the interacting residues in the protein active site in 
a solvated model to characterize the structural and thermodynamic 
features in the mode of drug binding and to determine the contribution of 
the solvent (see, e.g., Wang et al. (1996) J. Am. Chem. Soc. 118:995- 
1001; Wang et al. (1995) J. Mol. Biol. 253:473-492; Ortiz et al. (1995) 
J. Med. Chem. 38:2681-2691, which describes a computational method 
for deducing QSARs from ligand-macromolecule complexes). Following 
the computational drug design protocol described herein, any potential 
new drugs that are identified can be synthesized in, for example, industry 
or academia, and subjected to further biological testing, such as in vitro 
studies or pre-clinical and clinical in vivo testing. 
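
The subtraction described above, and a per-residue view of the decomposed binding energy, can be sketched as follows. This is a minimal sketch under stated assumptions: the free energies are taken as already computed by some other tool, the numbers and residue labels are hypothetical, and the units (kcal/mol) are an assumption. The enthalpic and entropic split mentioned in the text is not modeled here.

```python
def binding_free_energy(g_complex, g_protein, g_drug):
    """Free energy of the complex minus that of the non-interacting partners."""
    return g_complex - (g_protein + g_drug)

def rank_contributions(contributions):
    """Sort decomposed contributions, most favorable (most negative) first."""
    return sorted(contributions.items(), key=lambda item: item[1])

# Hypothetical free energies in kcal/mol.
dg_bind = binding_free_energy(g_complex=-1250.0, g_protein=-1200.0, g_drug=-40.0)

# Hypothetical decomposition of the total over active-site residues and solvent.
contributions = {"ILE50": -3.0, "solvent": -2.5, "ASP25": -4.5}
ranked = rank_contributions(contributions)

# In this toy example the decomposed terms add up to the total.
total_from_parts = sum(contributions.values())
```

Ranking the terms in this way shows which residues dominate binding for a given structural variant, which is the information used when comparing variants.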

Based on the predicted intermolecular interactions of the drugs or 
modified drugs with the structural variant models from binding studies, 
potential drug candidates that are specific for a protein with a selected 
polymorphism or that specifically interact with all proteins exhibiting the 
polymorphism can be identified. 

It is also possible to individualize drug design or drug therapy by 
determining the structural variants associated with a particular patient and 
then designing or screening drugs or potential drugs to maximize efficacy 
in that subject or in a subpopulation that exhibits the same genetic 
polymorphism. The variants may also be used to track polymorphic 
variations in infectious organisms, such as viruses. For example, the 
human immunodeficiency viruses (HIVs) reverse transcriptase and 
protease have served as drug targets (see, Erickson et al. (1996) Annu. 
Rev. Pharmacol. Toxicol. 36:545-571); their three-dimensional structures 
are known (see, e.g., Nanni et al. (1993) Perspectives in Drug Discovery 
and Design 1:129-150; Kroeger et al. (1997) Protein Eng. 10:1379- 
1383). The clinical emergence of drug-resistant variants of these viruses 
has limited the long-term effectiveness of drugs targeted against these 
enzymes. 

As noted, these enzymatic proteins must exhibit conserved 3-D structures 
in order to preserve function. The methods herein permit design 
of drugs specific for the conserved regions of the 3-D structures. They 
also permit selection of drug regimens based upon the alleles expressed. 
Hence, methods for designing HIV enzyme-specific drugs are provided. 
Flow charts illustrating exemplary alternative embodiments using protein 
3-D structures derived from genetic polymorphisms in structure-based 
drug design studies are provided (see, Figs. 2 and 3). In the flow charts 
depicted in these figures, the drug design includes structure-based drug 
design methods (see, Figure 2) and computational docking of drugs with 
structural variants, evaluation of the binding energy of the docked 
complexes, and correlation of the binding energy with patient data such 
as age, gender, race, drug treatment history, and any other pertinent 
information that is available (see, Figure 3). The data generated by this 
computer-based method can be stored in a database, such as, for 
example, in a relational database. The resulting database can be screened 
using searching tools to select potential drugs and therapeutic agents that 
bind to or exhibit biological responses towards target proteins. 
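
A relational layout for such data, and one possible screening query, can be sketched with Python's built-in sqlite3 module. The table and column names, the drug and variant labels, the energies and the threshold are all hypothetical; the sketch also assumes that every drug has a binding entry for every variant, so that the worst (least negative) value per drug covers all variants.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE variant (
        id    INTEGER PRIMARY KEY,
        label TEXT NOT NULL
    );
    CREATE TABLE binding (
        variant_id INTEGER NOT NULL REFERENCES variant(id),
        drug       TEXT NOT NULL,
        dg         REAL NOT NULL      -- free energy of binding (assumed kcal/mol)
    );
""")
conn.executemany(
    "INSERT INTO variant (id, label) VALUES (?, ?)",
    [(1, "wild type"), (2, "variant A")],
)
conn.executemany(
    "INSERT INTO binding (variant_id, drug, dg) VALUES (?, ?, ?)",
    [(1, "drug X", -10.0), (2, "drug X", -6.5),
     (1, "drug Y", -8.0), (2, "drug Y", -9.0)],
)

def drugs_binding_all_variants(connection, threshold):
    """Return drugs whose weakest binding across variants still meets the threshold."""
    rows = connection.execute(
        "SELECT drug FROM binding GROUP BY drug "
        "HAVING MAX(dg) <= ? ORDER BY drug",
        (threshold,),
    ).fetchall()
    return [row[0] for row in rows]

hits = drugs_binding_all_variants(conn, -7.0)
```

Patient attributes such as age, gender or treatment history could be held in a further table keyed to the variant and joined in the same query.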
C. Applications of computer-based methods 

As discussed above, the computer-based methods provided herein 
include some or all of the steps of obtaining one or more, preferably two 
or more, amino acid sequences of a target protein that is the product of a 
gene exhibiting genetic polymorphisms; generating 3-dimensional (3-D) 
protein structural variant models from the sequences; and based upon the 
structures of the 3-D models, designing drug candidates or modifying 
existing drugs based on the predicted intermolecular interactions of the 
drug candidates or modified drugs with the structural variants by 
computationally docking drug molecules with the target protein models; 
energetically refining the docked complexes; determining the binding 
interactions between the drug or potential new drug candidate molecules 
and the models by calculating the free energy of binding of the docked 
complexes and decomposing the total free energy of binding based on 
interacting residues in the protein active site or sites deemed important 
for protein activity. There are numerous applications of these methods, 
which include structure-based drug design and drug testing; selection of 
clinically relevant populations for drug testing and other such methods. 

1. Genetic polymorphisms and structure-based drug design 

As noted above, structure-based drug design is an increasingly 
useful methodology that has made a great impact in the design of 
biologically active lead compounds. Drug designers can design and 
screen potential new drugs via computational methods, such as docking 
or binding studies, before actually beginning patient testing. The drugs 
designed by such methods, and also those identified by traditional 
methods of drug discovery, are then tested in clinical trials. Among those 
that show efficacy for a particular indication and low toxicity ultimately 
are approved for use. It is found, however, that not all patients with a 
particular indication respond uniformly to the drugs. The drug may not be 
efficacious or side-effects may be pronounced. 

The methods provided herein represent a further advance in the 
use of rational drug design methods. As described herein, polymorphic 
variation has an effect upon the 3-D structure of encoded proteins. As a 
result, drugs interact with variants differently, leading to differential 
responses in the population as a whole. A new approach to drug design 
and testing is provided herein. This method involves identifying 
polymorphisms and determining the resulting 3-D structures, which are then 
used in methods including computational drug design, the selection of 
patient populations, the design of treatment protocols and other 
applications. 

2. Drug resistance 

Methods for understanding and overcoming drug resistance by 
using 3-D protein model structures resulting from multiple genetic 
polymorphisms or mutations in infectious agents, such as viruses, 
bacteria and other pathogenic agents, are provided. Also provided are 
methods for using this information in drug design studies. 

In the case of infectious organisms or other replicating or mutating 
agents, such as flu, HIV, rhinovirus or biological warfare agents, some 
polymorphisms or mutations may arise over time which confer resistance 
or susceptibility to specific drug therapy, for example, by altering the drug 
target structure or physical properties so that a specific drug or therapy, 
such as an antibiotic or vaccine, may no longer be able to bind to or 
otherwise interact with the target protein to exert its desired biological 
effect. For certain infectious agents, such as HIV, genetic polymorphisms 
in certain genes give rise to drug resistance as the virus mutates (see, 
e.g., Erickson et al. (1996) Annu Rev. Pharmacol. Toxicol. 36:545-571). 

Where drug resistance that arises from mutations or polymorphisms 
is observed, the methods described herein can be used to develop new 
drugs that overcome the resistance. For example, once drug resistance is 
observed, the structure associated with the resistant polymorphism can 
be determined and used in further drug design studies to suggest new 
drugs or modifications to the existing drug that will restore biological 
activity by targeting different mutants or that will target multiple mutants 
simultaneously. 

The model structures can also be used to correlate drug resistance 
in infectious diseases with the structural variants derived from genetic 
polymorphisms. Here, the 3-D structure of the virus or other drug target 
is determined for the particular variant model against which the drug was 
effective. When drug resistance arises due to a genetic polymorphism, a 
model for the structure variant associated with the resistant organism can 
be generated, and a new drug can be designed or modifications can be 
made to the existing drug to overcome the resistance. 

For example, samples of the mutating organism can be obtained 
over time and structural models for the resulting proteins can be 
generated. These models can then be used to design new drug therapies 
that are active against the mutated organism. Multiple drug resistant 
structures can be analyzed to obtain an average structure or to identify 
common structural features in order to design new drugs that have the 
broadest spectrum of activity against multiple mutations. 

Such structural information is useful in designing effective drug 
therapies to overcome resistance or to develop drugs that are effective 
over a range of genetic polymorphisms and thus work for the maximum 
number of patients. 

3. Identification of conserved structural features or 
pharmacophores 

If common structural features are observed over a range of protein 
targets that are derived from genetic polymorphisms, these common 
features may be used to design a drug that is effective with a variety of 
genetic polymorphisms and thus many patients. The retention of certain 
common structural features over a large number of genetic 
polymorphisms suggests that those features may not be mutatable 
because the conserved structure may be essential to protein function, 
e.g., to the viability of an infectious organism or virus. Such conserved 
structural elements are prime targets for structure-based drug design, 
e.g., anti-infective or antibiotic drug design, and can lead to highly 
effective therapies. 

The common structural features can serve as a basis for structure- 
based drug design, for example, by serving as a scaffold for building a 
receptor model into which potential drug candidates can be docked or as 
a pharmacophore query for screening a library of physical or virtual 
chemical or biochemical molecules to identify compounds that match the 
pharmacophore template and, thus, are potential drug candidates. 

Analysis of 3-D protein structural variants derived from genetic 
polymorphisms to identify the common structural features over a large 
number of structural variants can aid in the design of drugs that are active 
over a broad range of genetic polymorphisms, such as in a large number 
of patients or against drug resistant targets. 

In comparing sets of related protein structures, such as those with 
the same biological function or those resulting from genetic 
polymorphisms, certain parts of the structural framework are often found 
to be conserved, while other parts vary among the proteins. Mutations 
that occur in the conserved regions of the structure can have significant 
effects on biological activity. For example, in viruses, the conserved features 
can be essential to protein function and, thus, to the viability of the 
infectious organism or virus. Identifying the conserved structural features 
over a range of structures often gives insight into which structural 
features are necessary for biological activity and are therefore non- 
mutatable. By analyzing a number of structural variants derived from 
genetic polymorphisms that exhibit drug resistance, it is possible to 
identify or design drugs that interact best with the common structural 
features in all of the variants. Using these features in structure-based 
drug design studies leads to the identification of drugs that retain 
biological activity despite multiple mutations, or polymorphisms, and 
could help to overcome the problem of drug resistance. 

In certain preferred embodiments, new potential drug candidates 
can be identified using the structural variant models by identifying 
pharmacophores or conserved features in the protein structural variant 
models and using this structural information to identify small molecules 
that would bind to the structural variant models. 

Using structural comparison tools described herein, the common 
structural features that are conserved across a range of structural variant 
models of a given protein based on different genetic polymorphisms can 
be identified. To do this, multiple structural variant models are compared, 
generally by superimposing the coordinates of one variant model onto 
those of one or more other variants and observing the structural fit. Such 
functionality is commonly found in molecular graphics or homology 
modeling packages. Once the optimum fit of structures is performed, 
then the structural features that are present throughout the structural 
variant models can be identified and used as the basis for drug 
interactions in structure-based drug design studies. For example, the 
pharmacophores or conserved features can be specified as database 
queries and a library or database of small molecule structures can be 
searched to identify new lead compounds to bind to the pharmacophores. 
Alternatively, other structure-based ligand design strategies can be 
employed to design lead compounds or to identify modifications to be 
made to existing drugs to improve biological activity. 
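
As a hedged illustration of the comparison step, conserved positions can be flagged by measuring how far each residue moves across the variant models. The sketch assumes the models have already been superimposed by a modeling package and are given as equal-length lists of one coordinate per residue (for example, alpha carbons). The function names, the coordinates and the 0.5 Å cutoff are hypothetical.

```python
from math import sqrt

def per_residue_spread(models):
    """RMS deviation of each residue from its mean position across models.

    models: list of models, each a list of (x, y, z) tuples, one per residue,
            already superimposed and in the same residue order.
    """
    n_models = len(models)
    spreads = []
    for coords in zip(*models):               # the same residue in every model
        centroid = [sum(c[k] for c in coords) / n_models for k in range(3)]
        msd = sum(
            sum((c[k] - centroid[k]) ** 2 for k in range(3)) for c in coords
        ) / n_models
        spreads.append(sqrt(msd))
    return spreads

def conserved_positions(models, cutoff=0.5):
    """Return 1-based residue positions whose spread is within the cutoff."""
    return [
        pos
        for pos, spread in enumerate(per_residue_spread(models), start=1)
        if spread <= cutoff
    ]

# Three hypothetical superimposed models of a three-residue fragment.
models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (7.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.9, 0.0, 0.0), (9.0, 0.0, 0.0)],
]
conserved = conserved_positions(models)
```

Positions that pass the cutoff in every variant model are candidates for the conserved framework; the third residue in this toy data varies too much to qualify.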

4. Identification of compensatory structural changes 

Certain proteins, for example, proteins of viruses or other infectious 
organisms, may harbor multiple genetic polymorphisms. Since each 
genetic polymorphism can give rise to slight changes in structure, some, 
and over time, many, additional genetic polymorphisms may cause 
changes in the protein structures that significantly affect biological 
activity. These structural changes could result in, for example, different 
dynamical behavior, alteration in enzyme kinetics or differences in 
substrate recognition, which can significantly alter drug response. For 
example, a mutation conferring resistance to one drug compound can suppress 
a mutation conferring resistance to a second drug due to compensatory 
effects. In these cases, a drug which 
is predicted to be ineffective for a given patient based upon the single 
nucleotide correlation may, in fact, be effective as a result of these 
changes. 

Because mutations are so frequent in HIV and other viruses, few 
sequences are exactly the same in different patients. Thus, generating 
multiple mutation sequence correlations for drug resistance is difficult 
or yields inconclusive results. If each patient has a different viral 
sequence due to a high viral mutation rate, then no sequence correlation 
is even possible in such cases. 

The methods described herein can be used to study the effects of 
multiple genetic polymorphisms on a resultant protein structure. Multiple 
mutations are common in HIV and other viruses, which makes sequence 
correlation difficult. By observing the structural effects of the mutations 
on the resulting protein, it is possible to look at the net effect of all 
structural changes and to consider the overall structure of the protein in 
drug design studies. For example, a mutation might occur in the active 
site, or site of drug action, in a protein. Additionally, there may be related 
mutations in other parts of the protein structure, which might not be 
identified from a single point mutation correlation. These related 
mutations could have an effect on biological activity of the protein. By 
looking only at the active site, it might be predicted that a drug or 
potential drug would not bind to the protein. The additional mutation, 
however, might cause compensatory structural changes in the protein 
structure that alter its properties in a way that restores biological activity. 

By computing 3-D protein structures from gene sequences 
containing multiple polymorphisms, it is possible to more accurately 
predict the effect of multiple sequence mutations on protein structure 
and, thus, to obtain a better correlation between sequence and drug 
resistance than by considering sequence correlations alone. This 
information can be useful, for example, in understanding drug resistance 
and can aid researchers and clinicians in developing new drug therapies to 
overcome drug resistance. 

The structures that are derived based on multiple genetic
polymorphisms can be used in structure-based drug design studies to 
provide frameworks, or scaffolds, into which drug or potential drug 
molecules can be docked. This permits the design of drugs that are 
active against a wider range of structural variants and, thus, in more patients
or against a range of drug-resistant proteins.

5. Clinical Applications 

A knowledge of the repertoire of structural differences arising from 
genetic polymorphisms across the human population or specific 
subpopulations can provide insight into the differing biological responses 
in patients based on their genetic differences. For example, where clinical 
data are available for patients having particular genetic polymorphisms, 
this information can be associated with the 3-D protein structural variants 
and used to find correlations between polymorphisms and observed drug
responses. 

The methods provided herein can be used to design drug therapies 
that bring about favorable clinical responses (or eliminate unfavorable 
effects) in patients, to identify pharmacological effects of drugs in 
different patient subpopulations (e.g., by age, race or gender) and to simulate
clinical trials to increase the probability that the trials will yield optimal
results. 

Because of the high cost of clinical trials, such studies are generally 
focused on small patient populations. The structural analysis tools 
described herein permit the extension of clinical trials to cover patient 
populations not specifically included in the study. This is accomplished 
through correlation of the structural variants derived from genetic 
polymorphisms with clinical responses. 

The molecular structures and databases described herein can also 
find application in the understanding and prediction of clinical or 
pharmacological drug responses, for example, efficacy, toxicity, dose 
dependencies or side effects in patients. For example, relational 
databases containing 3-D protein structural variants can provide a means 
for managing and using the information to understand and predict clinical 
responses in patients. 

In other embodiments, observed clinical data from patients in a 
clinical trial can be associated with the structural variant models for each 
genetic polymorphism exhibited in the clinical subjects, for example, in a 
structural polymorphism relational database. The correlation between the 
structural variants and observed clinical effects can then be utilized to 
predict clinical outcomes in patients that did not participate in the clinical 
trial. For example, a structural variant model can be generated for a 
patient based on a genetic polymorphism exhibited in the patient, and the 
database can be mined to identify structurally similar variants for which 
clinical results are known. Structural similarity can be determined, for 
example, by superimposing the structures and measuring the RMS (root
mean squared) differences between the structures or by using pattern 
matching or motif searching algorithms. The results can be used to 
predict clinical responses in the patient based on the clinical data 
associated with the structurally similar variants. 
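
The RMS comparison and database lookup described above can be illustrated with a minimal sketch. It assumes that the structures have already been superimposed and contain the same atoms in the same order; the function names, variant identifiers, coordinates and clinical labels are hypothetical and are not taken from the databases provided herein.

```python
import math

def rmsd(coords_a, coords_b):
    """RMS difference between two equal-length lists of (x, y, z) atom
    coordinates that are assumed to be already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("structures must have the same, non-zero atom count")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

def most_similar_variant(patient_coords, database):
    """Return (variant_id, rmsd, clinical_response) for the closest entry."""
    best = None
    for variant_id, (coords, response) in database.items():
        d = rmsd(patient_coords, coords)
        if best is None or d < best[1]:
            best = (variant_id, d, response)
    return best

# Hypothetical three-atom toy structures with made-up clinical labels.
database = {
    "variant_A": ([(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)], "responder"),
    "variant_B": ([(0.0, 0.0, 0.0), (1.5, 1.0, 0.0), (3.0, 2.0, 0.0)], "non-responder"),
}
patient = [(0.0, 0.0, 0.0), (1.5, 0.1, 0.0), (3.0, 0.0, 0.0)]
match = most_similar_variant(patient, database)
```

In this toy case the patient model lies closest to variant_A, so the clinical data associated with variant_A would be the starting point for a prediction. A practical implementation would first perform an optimal superposition before computing the RMS difference.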

The predicted correlations can also be used to aid in the design of 
subsequent clinical trials. The follow-on trials can be made more effective 
through the judicious selection of patients with given genotypes (i.e.,
those exhibiting the same genetic polymorphisms), as guided by the 
structurally predicted outcomes. For example, a clinical trial can be 
designed based on a subpopulation of clinical subjects which exhibit a 
specific genetic polymorphism (i.e., structural variant) to demonstrate the
effectiveness of a given therapeutic on a targeted population. 

In other embodiments, the methods provided herein can be used in 
the selection of drug therapies for patients exhibiting a particular genetic 
polymorphism. This is accomplished by generating the structural variant 
model associated with the polymorphism, docking drug molecules that 
might be used to treat the patient into the structural variant model and 
calculating the binding energies of each drug with the variant. The results 
of docking or free energy calculations can be correlated to clinical data, 
for example, patient population (e.g., ethnic background, race, sex, age), 
treatment regimen, patient response to a particular drug or duration of 
treatment. The binding energies can be compared, for example, to 
determine which drug would best bind to the variant in order to identify 
the drug that could best be used to treat the patient to optimize biological 
activity. 
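
The comparison of binding energies can be sketched as a simple ranking. The sketch assumes that a docking or free energy calculation has already produced one energy per drug for the patient's structural variant and that a more negative energy indicates more favorable binding; the drug names and energy values are hypothetical.

```python
def rank_drugs_by_binding(binding_energies):
    """Return drug names ordered from most favorable (most negative)
    to least favorable computed binding energy."""
    return sorted(binding_energies, key=binding_energies.get)

# Hypothetical computed binding energies (kcal/mol) for one structural variant.
energies = {"drug_1": -7.2, "drug_2": -10.4, "drug_3": -5.9}
ranking = rank_drugs_by_binding(energies)
best_drug = ranking[0]
```

Here drug_2 ranks first. The ranking is only one input to drug selection, since the computed energies can also be correlated with the clinical data noted above.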

D. Creation of 3-D Structural Polymorphism Databases 

The above-noted methods all rely upon the use of databases of 
nucleic acid sequences. Any such database known to those of skill in the 
art may be employed; numerous such databases are publicly available
(e.g., the Stanford HIV database). The Stanford HIV database is a hierarchical
database with information about HIV patients who received or did not
receive protease inhibitor treatments, patient-dates, isolates, sequences, 
hyperlinks to MEDLINE and GenBank abstracts, and art. This database, 
however, does not contain 3-D protein structures of any proteins 
including HIV reverse transcriptase (RT) and HIV protease (PR; see, e.g., 
Shafer et al. (1999) Nucleic Acids Res. 27:348-352, Shafer et al. (1999) 
J. Virol 73:6197-6202, http://hivdb.stanford.edu/hiv, Richter (January 
20, 1999) "AIDS drugs found to be effective in the world's most common 
HIV strains").

Databases of sequences and associated information may also be 
generated as described herein by obtaining samples and sequences from a 
variety of sources. In all instances, further databases are generated by 
then calculating 3-D structural models of the encoded proteins or relevant
portions thereof, such as active binding sites, from the nucleic acid
sequence information. It is these databases of nucleic acid sequence
and/or primary protein sequence and the associated 3-D structure that are
provided herein and that are used in all of the methods provided herein, except for the
computational phenotyping discussed below, which does not require a
database. Hence, databases containing computationally
determined 3-D structures of polymorphic proteins or portions thereof are
provided herein. These databases serve as tools in a variety of methods, 
including those provided herein. 

Databases that include 3-D structures for variant proteins encoded 
by the nucleic acids that contain polymorphisms are provided. These are 
generated after 3-D structural models are constructed for the protein 
structural variants, preferably for all of the protein structural variants, 
representing the genetic polymorphisms, by inputting the atomic 
coordinates into a structural polymorphism database, preferably a 
relational database, and optionally with associated structural and/or 
physical properties (e.g., phi/psi and side-chain angles and energetics), 
and other data, if available, including, but not limited to, historical
data, such as parental medical histories, and clinical data. The resulting
database is used in structure-based drug design studies and for clinical 
analyses. Figure 11 is a tabulation of the 3-D coordinates of a
representative entry, an HIV protease, that is encoded by the DNA in one
of SEQ ID Nos. 3-74 and 77-117, and that is an entry in an exemplary
database that includes 3-D structures. Exemplary databases that contain
the nucleic acid sequences and structures of all proteins encoded by
SEQ ID Nos. 3-117 as well as additional nucleic acids are provided herein
and are described in the EXAMPLES.
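
One possible relational layout for such a structural polymorphism database is sketched below using an in-memory SQLite database. The table names, column names and sample rows are assumptions made for illustration only; they are not the schema of the exemplary databases described in the EXAMPLES.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE variant (            -- one row per polymorphic variant
    variant_id INTEGER PRIMARY KEY,
    seq_id     TEXT NOT NULL,     -- label of the source sequence (hypothetical)
    protein    TEXT NOT NULL,
    sequence   TEXT NOT NULL      -- primary protein sequence
);
CREATE TABLE atom (               -- atomic coordinates of the 3-D model
    variant_id INTEGER NOT NULL REFERENCES variant(variant_id),
    serial     INTEGER NOT NULL,
    name       TEXT NOT NULL,
    residue    TEXT NOT NULL,
    res_num    INTEGER NOT NULL,
    x REAL NOT NULL, y REAL NOT NULL, z REAL NOT NULL,
    PRIMARY KEY (variant_id, serial)
);
CREATE TABLE clinical (           -- optional associated clinical data
    variant_id INTEGER NOT NULL REFERENCES variant(variant_id),
    drug       TEXT NOT NULL,
    response   TEXT NOT NULL
);
""")

# Placeholder rows; the values are illustrative, not real database entries.
conn.execute("INSERT INTO variant VALUES (1, 'example-1', 'HIV PR', 'PQITLW')")
conn.executemany(
    "INSERT INTO atom VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [(1, 1, "N", "PRO", 1, 0.0, 0.0, 0.0),
     (1, 2, "CA", "PRO", 1, 1.46, 0.0, 0.0)],
)
conn.execute("INSERT INTO clinical VALUES (1, 'drug_1', 'resistant')")

# Join the structure and the clinical data for each variant.
rows = conn.execute("""
    SELECT v.protein, COUNT(a.serial), c.drug, c.response
    FROM variant v
    JOIN atom a ON a.variant_id = v.variant_id
    JOIN clinical c ON c.variant_id = v.variant_id
    GROUP BY v.variant_id, c.drug, c.response
""").fetchall()
```

Keeping coordinates, sequences and clinical observations in separate tables keyed by a variant identifier allows the structural data to be loaded first and the clinical data to be added when it becomes available.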

A database is preferably interfaced to a molecular graphics package 
that includes 3-D visualization and structural analysis tools, to analyze 
similarities and variations in the protein structural variant models (see, 
copending U.S. application Serial No. 09/531,995, which is published as 
International PCT application No. WO 00/57309, and is a continuation-in- 
part of U.S. application Serial No. 09/272,814, filed March 19, 1999). 
Briefly, International PCT application No. WO 00/57309 provides a 
database and interface for access to 3-D molecular structures and 
associated properties, which can be used to facilitate the design of 
potential new therapeutics. The interface also provides access to other 
structure-based drug discovery tools and to other databases, such as 
databases of chemical structures, including fine chemical or combinatorial 
libraries, for use in structure-focused high-throughput screening, as well 
as to a host of public domain databases and bioinformatics sites. This interface can be modified as
needed to adapt for use with a particular database.

A relational database that collects multiple data files relating to the 
same molecular structure in the same subdirectory and that provides an 
interface to access all of the collected files from the same structure using
the same user interface program is also provided. The collected files 
include a variety of information and computer file formats, depending on 
the type of information to be conveyed to users of the database. In 
practice, a user communicates over a public network, such as the 
Internet, or over a controlled network, such as an intranet, with a secure
file server that controls access to the collected files, and the interface to 
the collected files is provided by a standard graphical user interface 
program that is widely available. In this way, a convenient means of 
searching molecular structure data for characteristics of interest is 
provided. Data searching, file viewing, and investigation of multiple 
representations of molecular structures from within a single viewing 
program can also be performed using the database and interface. 

The data files can be those available over a wide network such as
the Internet, and a suitable graphical user interface can be designed or obtained.
One such interface for viewing the data files is a standard Internet
web browser program, such as the web browser products by Netscape 
Communications, Inc. and Microsoft Corporation that are distributed free 
of charge. Such browser products readily import and provide views of 
files having a wide variety of formats that contain alphanumeric, video, 
and audio data. A security server, which is preferably located between the user
browser program at a network client machine and the database, controls access to the
database, which is housed at a file server connected to the security
server. Before a user gains access to the database, the security server 
checks authorization for the individual user and then, if appropriate, 
permits downloading of appropriate data from the database file server. It 
is contemplated that the databases containing 3-D structures of proteins
or portions thereof that exhibit polymorphism will be loaded.

Data for a molecular structure is loaded into the database by 
specifying the file pathnames for the various data files that contain the 
different types of data, including the different molecule views. Using a 
browser to view the data files permits various helper applications, called
plug-ins, to smoothly and transparently accept the different file formats 
and provide views to the user. The various data files of the database are 
organized in accordance with the database design when they are loaded 
into the database and are managed by a relational database management 
program. 

In addition to 3-D protein structures and associated primary
sequences, as provided herein, the database can optionally contain 
associated biological or clinical data, such as drug resistance, side 
effects, efficacy, pharmacokinetics and other data, that correlate with or 
can be correlated with the structural variants. This information will be used for
correlating observed clinical effects to specific structural variants and for 
predicting clinical responses and outcomes based on a patient's structural 
variants, i.e., genetic polymorphisms. 

Structural analysis tools are preferably integrated with the 
structural database for comparing and analyzing the resulting protein 
structural variant models. For example, the molecular graphics software 
package described in International PCT application No. WO 00/57309, 
includes structural analysis capability to measure the structural attributes 
of the model (distances, angles, etc.), to analyze sequences and 
secondary structures, to study physical properties such as 
hydrophobicity, electrostatic potential, and active or reactive sites in the 
protein, as well as to evaluate the quality of the structure (both 
conformationally and energetically). 

Structures can also be compared by aligning them, such as by 
performing a least squares fitting of the x-, y- and z-coordinates of each 
of the structural variant models and superimposing the structures or any 
other alignment method or structural comparison method. For example, 
the structures of the variants can be clustered, or grouped together, 
based on structural similarity. This can save time over studying each 
structural variant independently because, where structures are considered 
to be similar enough that they are clustered together (e.g., if their
structures can be superimposed within a specified tolerance), then only a 
representative structure, or perhaps an average structure or scaffold, 
which is derived as a composite of the individual structural variant 
models, can be used in further drug design studies. 
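
The clustering of structural variants within a specified tolerance can be sketched as follows. This is a simple greedy grouping, chosen for brevity rather than taken from the methods herein; it assumes pre-superimposed structures with identical atom ordering, and the variant identifiers, coordinates and tolerance are hypothetical.

```python
import math

def rmsd(a, b):
    """RMS difference between two pre-superimposed coordinate lists."""
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))

def cluster_variants(structures, tolerance):
    """Greedy grouping: a structure joins the first cluster whose
    representative lies within the tolerance, else it starts a new cluster."""
    clusters = []  # list of (representative_id, [member_ids])
    for vid, coords in structures.items():
        for rep_id, members in clusters:
            if rmsd(coords, structures[rep_id]) <= tolerance:
                members.append(vid)
                break
        else:
            clusters.append((vid, [vid]))
    return clusters

def average_structure(structures, member_ids):
    """Composite structure: per-atom mean of the members' coordinates."""
    n = len(member_ids)
    n_atoms = len(structures[member_ids[0]])
    return [tuple(sum(structures[m][i][k] for m in member_ids) / n
                  for k in range(3))
            for i in range(n_atoms)]

# Hypothetical two-atom toy structures.
structures = {
    "v1": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "v2": [(0.0, 0.0, 0.0), (1.0, 0.2, 0.0)],
    "v3": [(0.0, 0.0, 0.0), (1.0, 3.0, 0.0)],
}
clusters = cluster_variants(structures, tolerance=0.5)
scaffold = average_structure(structures, clusters[0][1])
```

With a tolerance of 0.5, v1 and v2 fall into one cluster and v3 forms its own, so only two representative or average structures need be carried into further drug design studies.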

Tools for database searching can also be included in the software 
package. These can be used to query the database for structural variant 
models having similar properties, such as molecular structure or sequence 
similarity. These tools are used, for example, to mine the database to 
identify variant models that are structurally similar (e.g. to find structures 
that overlap within a specified tolerance), and thus would be predicted to 
interact in the same way with potential drugs or exhibit the same clinical 
response. This information could be useful in understanding the 
structural or clinical effects of different genetic polymorphisms and could 
potentially save time and money by extending the results of previously 
performed clinical or computer-based drug design studies to predict the 
results of studies on similar structural variants that have not yet been 
performed. 
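
A query by sequence similarity, one of the search properties mentioned above, can be sketched as a percent-identity filter. The sketch assumes pre-aligned sequences of equal length; the sequence fragments, identifiers and identity cutoff are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions with identical residues in two aligned,
    equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned and non-empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)

def find_similar(query, database, min_identity):
    """Return (variant_id, identity) pairs at or above the cutoff,
    ordered from most to least similar."""
    hits = [(vid, percent_identity(query, seq)) for vid, seq in database.items()]
    return sorted((h for h in hits if h[1] >= min_identity),
                  key=lambda h: -h[1])

# Hypothetical ten-residue fragments.
database = {"v1": "PQITLWQRPL", "v2": "PQITLWQKPL", "v3": "AAAAAAAAAA"}
hits = find_similar("PQITLWQRPL", database, min_identity=80.0)
```

The same pattern applies to structural queries by substituting an RMS difference and a maximum tolerance for the identity score and minimum cutoff.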

1. Exemplary Databases

Databases containing data representative of the 3-D structure of 
structural variants encoded by a selected gene or genes or the 3-D 
structure of other polymorphic variants are provided. The selected genes 
can be drug targets, such as receptors and genes of infectious agents,
such as the HIV protease or reverse transcriptase. Exemplary databases 
are presented in Example 5 which describes the construction, interface, 
use and applications of HIV PR and RT databases. These databases may
be stored on any suitable medium and used in any suitable computer 
system. Systems and methods for generating, storing and processing 
databases are well known. 

2. Computer systems 
Computer systems for processing the databases and computer
systems containing the databases are provided. The processing that 
maintains the database and performs the methods and procedures using 
the databases may be performed on multiple computers, or may be 
performed by a single, integrated computer. For example, the computer 
through which data is added to the database may be separate from the 
computer through which the database is sorted or analyzed, or may be 
integrated with it. Each computer operates under control of a central 
processor unit (CPU), such as a "Pentium" microprocessor and associated 
integrated circuit chips, available from Intel Corporation of Santa Clara, 
California, USA. A computer user can input commands and data from a 
keyboard and display mouse and can view inputs and computer output at 
a display. The display is typically a video monitor or flat panel display 
device. The computer also includes a direct access storage device 
(DASD), such as a fixed hard disk drive. The memory typically includes 
volatile semiconductor random access memory (RAM). Each computer 
preferably includes a program product reader that accepts a program 
product storage device from which the program product reader can read 
data (and to which it can optionally write data). The program product 
reader can include, for example, a disk drive, and the program product 
storage device can comprise removable storage media such as a magnetic 
floppy disk, an optical CD-ROM disc, a CD-R disc, a CD-RW disc, or a 
DVD data disc. If desired, computers can be connected so they can 
communicate with each other, and with other connected computers, over 
a network. Each computer can communicate with the other connected 
computers over the network through a network interface (see, e.g., 
Examples below) that permits communication over a connection between 
the network and the computer. 

The computer operates under control of programming steps that 
are temporarily stored in the memory in accordance with conventional 
computer construction. When the programming steps are executed by 
the CPU, the pertinent system components perform their respective
functions. Thus, the programming steps implement the functionality of 
the system as described above. The programming steps can be received 
from the DASD, through the program product reader, or through the 
network connection. The storage drive can receive a program product, 
read programming steps recorded thereon, and transfer the programming 
steps into the memory for execution by the CPU. As noted above, the 
program product storage device can include any one of multiple 
removable media having recorded computer-readable instructions, 
including magnetic floppy disks and CD-ROM storage discs. Other 
suitable program product storage devices can include magnetic tape and 
semiconductor memory chips. In this way, the processing steps 
necessary for operation can be embodied on a program product. 

Alternatively, the program steps can be received into the operating 
memory over the network. In the network method, the computer receives 
data including program steps into the memory through the network 
interface after network communication has been established over the 
network connection by well known methods that will be understood by 
those skilled in the art without further explanation. 

The computer that implements the client side processing, and the 
computer that implements the server side processing, or any other 
computer device of the system, may comprise any conventional computer 
suitable for implementing the functionality described herein. FIGURE 9 is 
a block diagram of an exemplary computer device 900 such as might 
comprise any of the computing devices in the system. Each computer 
operates under control of a central processor unit (CPU) 902, such as an 
application specific integrated circuit (ASIC) from a number of vendors, or a 
"Pentium"-class microprocessor and associated integrated circuit chips, 
available from Intel Corporation of Santa Clara, California, USA. Commands 
and data can be input from a user control panel, remote control device, or a 
keyboard and mouse combination 904 and inputs and output can be viewed 
at a display 906. The display is typically a video monitor or flat panel display
device. 

The computer device 900 may comprise a personal computer or, in 
the case of a client machine, the computer device may comprise a Web 
appliance or other suitable Web-enabled device for viewing Web pages. In 
the case of a personal computer, the device 900 preferably includes a direct 
access storage device (DASD) 908, such as a fixed hard disk drive (HDD). 
The memory 910 typically comprises volatile semiconductor random access 
memory (RAM). If the computer device 900 is a personal computer, it 
preferably includes a program product reader 912 that accepts a program 
product storage device 914, from which the program product reader can 
read data (and to which it can optionally write data). The program product 
reader can comprise, for example, a disk drive, and the program product 
storage device can comprise removable storage media such as a floppy disk, 
an optical CD-ROM disc, a CD-R disc, a CD-RW disc, a DVD disk, or the like. 
Semiconductor memory devices for data storage and corresponding readers 
may also be used. The computer device 900 can communicate with the 
other connected computers over a network 916 (such as the Internet) 
through a network interface 918 that enables communication over a 
connection 920 between the network and the computer device. 

The CPU 902 operates under control of programming steps that are 
temporarily stored in the memory 910 of the computer 900. When the 
programming steps are executed, the pertinent system component performs 
its functions. Thus, the programming steps implement the functionality of 
the system illustrated in FIGURE 1. The programming steps can be received
from the DASD 908, through the program product 914, or through the 
network connection 920, or can be incorporated into an ASIC as part of the 
production process for the computer device. If the computer device includes 
a storage drive 912, then it can receive a program product, read 
programming steps recorded thereon, and transfer the programming steps 
into the memory 910 for execution by the CPU 902. As noted above, the 
program product storage device can comprise any one of multiple removable
media having recorded computer-readable instructions, including magnetic 
floppy disks, CD-ROM, and DVD storage discs. Other suitable program 
product storage devices can include magnetic tape and semiconductor 
memory chips. In this way, the processing steps necessary for operation in 
accord with the methods herein can be embodied on a program product. 

Alternatively, the program steps can be received into the operating 
memory 910 over the network 916. In the network method, the computer
receives data including program steps into the memory 910 through the 
network interface 918 after network communication has been established 
over the network connection 920 by well-known methods that will be 
understood by those skilled in the art without further explanation. The 
program steps are then executed by the CPU 902 to implement the 
processing of the system. 

To implement the functionality described herein, it has been found 
that a suitable computer for performing database server tasks includes a 
"Pentium" level CPU having at least 128 MB of memory, 30 GB of disk 
storage, and 256 MB of disk swap space for files. A recommended 
configuration for computer performance would include, for example, a 
"Pentium III" processor at 700 MHz or faster, memory of 256 MB or 
greater, disk storage space of 50 GB or more, and swap space of 500 MB 
or more. A suitable configuration for performing user tasks as described 
above includes a "Pentium" level CPU having 128 MB memory, disk 
space of 240 MB with swap space of 256 MB, and an optional display 
circuit card supporting OpenGL and having 4 MB of memory. A 
recommended configuration includes, for example, a "Pentium III" 
processor at 500 MHz or faster, memory of 256 MB or greater, disk 
space of 500 MB or more, swap space of 500 MB or more, and an 
optional display card having 8 MB of memory or more, supporting 
resolution of 1024 x 768. 

In a preferred embodiment, the software used in the computing
system described above includes, for the server machine, operating 
system software such as "Windows NT Server 4.0" from Microsoft 
Corporation, with Service Pack 5, Version 1280 (10 June 1999) or more 
recent, with database management server software such as, but not
limited to, "Oracle Server Standard Edition 8.1" from Oracle Corporation. 
The software used in a preferred embodiment of the user machine 
includes operating system software such as "Windows NT Workstation 
4.0" from Microsoft Corporation, with Service Pack 5, version 1280 (10 
June 1999) or more recent, as well as "Oracle Client Standard Edition 
Version 8.1" or higher. The client machine will also be compliant with
the "Java" programming language (Java Runtime Environment 1.2.2). As 
will be known to those skilled in the art, other configurations may be 
suitable, depending on the applications being used and the computer 
performance desired. 

E. Computational phenotyping

Also provided herein is a method designated computational
phenotyping (also referred to herein as in silico
phenotyping). This refers to the method in which a 3-D protein structure
is generated from a given genotype and protein-drug binding analyses in 
silico (computationally) are performed in order to determine whether drug 
binding does (i.e., sensitive) or does not (i.e., resistant) take place. This
type of analysis is contemplated to be performed for an individual patient
or subject or groups thereof (such as ethnic groups, gender-based or age-
based groups, particular species or groups thereof) to assess or select a
drug for treatment of a particular disease or other such use, and is done 
to assess efficacy of a particular drug on a desired target, where the 
target exhibits polymorphisms. The following discussion and example, 
below, is with reference to HIV PR and RT, but it is understood that the
methods and applications can be applied to any protein or gene product 
that exhibits polymorphic variation, and particularly to gene products that
are drug targets. 
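
The overall flow of computational phenotyping can be outlined as a minimal sketch. Translation of the genotype uses the standard genetic code; the structure-building and docking steps are passed in as caller-supplied functions because they depend on the modeling software used, and the stand-in functions, drug name and energy threshold shown here are hypothetical placeholders rather than the methods provided herein.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def translate(dna):
    """Translate a coding DNA sequence, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def computational_phenotype(dna, build_model, dock, drug, threshold):
    """Genotype -> protein sequence -> 3-D model -> binding energy -> call.
    Assumes binding is deemed to occur when energy <= threshold."""
    protein = translate(dna)
    model = build_model(protein)   # placeholder for 3-D structure generation
    energy = dock(model, drug)     # placeholder for protein-drug binding analysis
    return "sensitive" if energy <= threshold else "resistant"

# Hypothetical stand-ins so the sketch runs end to end.
def toy_build_model(protein):
    return protein

def toy_dock(model, drug):
    return -9.0 if "I" in model else -3.0

call_a = computational_phenotype("CCTCAGATC", toy_build_model, toy_dock,
                                 "drug_1", threshold=-6.0)
call_b = computational_phenotype("CCTCAGGTC", toy_build_model, toy_dock,
                                 "drug_1", threshold=-6.0)
```

The single energy threshold is an assumption made to keep the example short; how binding is judged to take place would depend on the binding analysis actually employed.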

Among the methods of computational phenotyping, there are three 
distinct methodologies that are clinically useful for determining either 
resistance or sensitivity to particular HIV-1 antiviral therapeutics. These 
are: genotyping, phenotyping, and virtual phenotyping. These 
methodologies are used to optimize the choice of therapeutics during the 
initiation of therapy, after drug failure, and/or during salvage therapy. 
Genotyping involves extracting the HIV viral RNA and amplifying all or 
part of the genes encoding the protease and reverse transcriptase 
proteins and sequencing them in order to assess the presence of 
resistance-associated mutations. 
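
The final step of genotyping, assessing a sequence for resistance-associated mutations, can be sketched as a comparison against a reference sequence. The sketch assumes aligned protein sequences of equal length; the sequence fragments and the set of flagged mutations are hypothetical and carry no clinical meaning.

```python
def call_substitutions(reference, sample):
    """List substitutions as reference residue, 1-based position and
    sample residue, e.g. 'R8K'."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [f"{r}{i}{s}"
            for i, (r, s) in enumerate(zip(reference, sample), start=1)
            if r != s]

# Hypothetical set of mutations of interest, for illustration only.
FLAGGED_MUTATIONS = {"R8K"}

reference = "PQITLWQRPL"
sample = "PQVTLWQKPL"
found = call_substitutions(reference, sample)
flagged = [m for m in found if m in FLAGGED_MUTATIONS]
```

Here two substitutions are found and one of them appears in the flagged set.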

In phenotyping, the amplified sequences are instead sub-cloned into 
expression vectors and then tested for their replicative ability in vitro by 
transfecting them into cultured and/or established cell lines, such as, for 
example, human T cells, monocytes, macrophages, dendritic cells,
Langerhans cells, hematopoietic stem cells, HeLa, XC, Mm5MT, LTL,
COS 7, NIH3T3, LTA, MCF-7, or other cells derived from human tissues
and cells that are the principal targets of viral infection, in the
presence or absence of antiviral drugs (see, e.g., U.S. Patent No.
5,837,464; see, also EP 0852626; EP 1012334; and EP 0877937).
Virtual phenotyping (ViroLogic, Inc.) is an interpretive service in which the
phenotype of a specimen (i.e. of a plant, animal, pathogen, or human) is 
inferred from the specimen's genotype based upon an extensive 
correlative database of known genotypes and phenotypes. Such a 
correlative database must be updated constantly to maintain clinical 
accuracy. 

Similar to virtual phenotyping, computational or in silico 
phenotyping infers phenotype based upon specimen genotype. Computa- 
tional phenotyping is distinct from virtual phenotyping in that sensitivity 
or resistance to drugs is determined directly through protein-drug binding 
analysis performed in silico and not through correlation with a database of
known genotypes and phenotypes. The advantage of computational 
phenotyping is that new resistance conferring mutations can be 
discovered rapidly and in "real time" without the need for phenotyping to 
train the genotype. Moreover, in silico phenotypes are not subject to
error caused by compensatory mutations, which may act synergistically
or anti-synergistically with resistance-associated mutations to increase, 
decrease, or reverse specific drug resistances. Computational 
phenotyping will generate information that can, for example, be presented 
in a report that is marketed within the in vitro diagnostics industry as an 
adjunct test/service to help optimize therapy and assist physicians, 
farmers, academic institutions, government agencies, and industries with
specimen treatment. Thus, a computer-based method for predicting
clinical responses, e.g., drug sensitivity or drug resistance, in patients,
plants, animals, pathogens, and microorganisms based on genetic 
polymorphisms is provided. 

The genotypes used in the methods are obtained from any source,
including, but not limited to, a plant, animal, pathogen, or
mammal, with the most preferred source being a mammal, particularly a
human for whom a particular drug treatment is contemplated, and is the
genotype of the drug target, such as, as exemplified herein, HIV RT or PR
from a particular infected individual. Other exemplary drug targets are
proteins, polypeptides, oligopeptides, including, but not limited to, a 
receptor, enzyme, hormone, and any such compound with which drugs or 
other ligands interact to bring about a biological response. For 
exemplification of this method, the protein considered is an enzyme, in 
particular HIV protease (PR) and reverse transcriptase (RT), which are 
therapeutic drug targets. Nucleic acid encoding the target from an
individual sample, such as a blood sample or other body fluid sample from a
mammal, such as a human patient, is sequenced, and the 3-D structure
thereof determined. The drug of interest is computationally tested to
assess whether it interacts with the sample. 

The following examples are included for illustrative purposes only 
and are not intended to limit the scope of the invention. 

EXAMPLE 1 

BINDING CORRELATIONS OF MUTANT FORMS OF HCV PROTEASE 
WITH DIFFERENT INHIBITORS 

This example provides the results of a theoretical study of NS3
protease complexes with two known peptide inhibitors (see SEQ ID Nos.
1 and 2; Ingallinella et al. (1998) Biochemistry 37:8906-8914).

Introduction 

During HCV replication, the final steps of processing are performed
by a virally encoded chymotrypsin-like serine protease, NS3. The HCV
polyprotein is an approximately 3000 amino acid protein that contains,
from the amino terminus to the carboxy terminus, a nucleocapsid protein (C),
envelope proteins (E1 and E2) and several non-structural proteins (NS1, 2, 3, 4a,
4b, 5a and 5b). NS3 is an approximately 68 kDa protein, encoded by
approximately 1893 nucleotides of the HCV genome, and has two distinct
domains: (a) a serine protease domain containing approximately 200 of
the N-terminal amino acids; and (b) an RNA-dependent ATPase domain at
the C-terminus of the protein. The NS3 protease is considered a member
of the chymotrypsin family and is a serine protease that is responsible for
proteolysis of the polypeptide (polyprotein) at the NS3/NS4a, NS4a/NS4b,
NS4b/NS5a and NS5a/NS5b junctions, thereby generating four viral
proteins during viral replication. This protease is inhibited by N-terminal
cleavage products of substrate peptides. The NS3 protease, which is
necessary for polypeptide processing and viral replication, has been
identified, cloned and expressed (see, e.g., U.S. Patent No. 5,712,145).

Active NS3 forms a heterodimer with a polypeptide cofactor, NS4A.
The crystal structures of NS3 with and without the NS4A cofactor are
known (see, e.g., Love et al. (1996) Cell 87:331-342; Habuka et al.
(1997) Jikken Igaku 75:2308-2313; Yan et al. (1998) Protein Sci.
7:837-847, which provides the structure with NS4A).

The NS3 protease is a target for design of antiviral drugs. For 
example, a series of potent hexapeptide inhibitors of NS3 has been 
developed by optimization of the product inhibitors (Ingallinella et al.
(1998) Biochemistry 37:8906-8914).

Analyses 

Models of the complexes of NS3 with the two protease inhibitor 
peptides were obtained by flexible docking of the peptides into the active 
site of the crystal structure of NS3/4A, followed by evaluation of protein- 
peptide binding energies. The models were tested by in situ modification 
of the docked ligands. A qualitative agreement between the binding 
energies and inhibitor IC50 values obtained from the literature was found.
The peptides studied were:

Sequence*                                  IC50, nM   SEQ ID
Ac-Asp1-D-Glu2-Leu3-Ile4-Cha5-Cys6-COO-    15         1
Ac-Asp1-L-Glu2-Leu3-Ile4-Cha5-Cys6-COO-    60         2

* Cha = β-cyclohexylalanine

In the modeling studies, it was assumed that: 

the high-affinity inhibitory peptides 1 and 2 have a similar mode of 
binding to the active site of NS3; 

the minimum binding pharmacophore includes the SH group of Cys6
and the carboxyl groups of Asp1, Glu2 and Cys6; and

the side chains of residues 3, 4 and 5 may enhance binding by
non-specific hydrophobic interaction with NS3.

Methods

Initial structure of the NS3-peptide complex 

The crystal structure of NS3 with a peptide cofactor NS4A was
obtained from the literature (Kim et al. (1996) Cell 87:343) and was used in
the studies with peptide inhibitors. The crystal structure of NS3/NS4A
was regularized using the molecular mechanics methods described herein.
Initial NS3-NS4A-peptide complexes were constructed by placing the peptides
into the NS3 binding site expected from structural homology to other serine
proteases:

the C-terminal carboxyl was placed near the oxyanion-stabilizing 
site (residues 137-139); 

the side chain of Cys 6 was inserted into the hydrophobic cavity 
formed by L135, F154 and A157; and 

the ε-amino group of K136 was placed in contact with the C-terminal
carboxyl (see Kim et al. (1996) Cell 87:343; Steinkuhler et al.
(1998) Biochemistry 37:8899).

Monte Carlo simulations 

In order to optimize the complexes, Biased Probability Monte
Carlo (BPMC) simulations (Abagyan et al. (1994) J. Mol. Biol. 235:983)
were performed on the NS3-peptide complexes using the ICM program 
(commercially available from MolSoft, San Diego, CA) with ECEPP/3 force 
field and atomic solvation energies (Momany et al. (1975) J. Phys. Chem. 
79:2361, Nemethy et al. (1992) J. Phys. Chem. 96:6472, Abagyan et al. 
(1997) Computer Simulations of Biomedical Systems: Theoretical and 
Experimental Applications, vol. 3, Kluwer Academic Publishers, 
Dordrecht, The Netherlands, p. 363). The sampling method was BPMC 
with random change of one variable at a time. A Metropolis acceptance 
criterion was applied after energy minimization (quasi-Newton, up to 1000 
steps). Simulations were performed at a temperature of 1000 K. The
peptide translational and rotational degrees of freedom, all peptide torsion
angles and χ angles of the protein side-chains located within 7.0 Å of any
peptide atom were varied during the BPMC simulations.
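
The sampling scheme above (random change of one variable, energy minimization, then a Metropolis test) can be sketched as follows. This is only an illustration under stated assumptions, not the ICM implementation: `toy_energy` is a hypothetical stand-in for the ECEPP/3 energy, the random move is uniform rather than drawn from biased probability distributions, and a simple gradient descent replaces the quasi-Newton minimizer.

```python
import math
import random

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def toy_energy(x):
    """Hypothetical rugged energy surface; a stand-in for the ECEPP/3 terms."""
    return sum(0.1 * xi * xi + math.cos(3.0 * xi) for xi in x)


def local_minimize(energy, x, step=0.05, max_steps=1000, h=1e-5):
    """Numerical-gradient descent, used here in place of quasi-Newton."""
    x = list(x)
    for _ in range(max_steps):
        grad = []
        for i in range(len(x)):
            up, down = x[:], x[:]
            up[i] += h
            down[i] -= h
            grad.append((energy(up) - energy(down)) / (2.0 * h))
        if max(abs(g) for g in grad) < 1e-6:
            break
        x = [xi - step * g for xi, g in zip(x, grad)]
    return x


def mc_minimization(energy, x0, n_steps=200, temperature=1000.0,
                    max_move=2.0, seed=0):
    """Change one variable, minimize, then apply the Metropolis test."""
    rng = random.Random(seed)
    current = local_minimize(energy, x0)
    e_current = energy(current)
    best, e_best = current, e_current
    for _ in range(n_steps):
        trial = current[:]
        i = rng.randrange(len(trial))  # one variable at a time
        trial[i] += rng.uniform(-max_move, max_move)
        trial = local_minimize(energy, trial)  # minimize before the test
        e_trial = energy(trial)
        delta = e_trial - e_current
        accept = delta <= 0.0 or rng.random() < math.exp(
            -delta / (R_KCAL * temperature))
        if accept:
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best


start = [2.5, -2.5]
best_x, best_e = mc_minimization(toy_energy, start)
```

Applying the acceptance test to minimized energies means the walk moves between local minima rather than between raw perturbed states, which is why a high simulation temperature can be used without the search drifting into strained conformations.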
The energy function used in the MC simulations included: 

ECEPP/3 terms for energy in vacuo (VDW (van der Waals), H-bond, 
electrostatic and torsion potentials); 
distance dependent electrostatics with ε0 = 4.0; and

surface energy with atomic solvation parameters.

The total energies of the complexes were calculated including
contributions from: ECEPP/3 VDW, H-bond, S-S bond and torsion terms;
exact-boundary electrostatic energy with ε0 = 8.0; and side-chain
entropies. Hydrophobic free energies were estimated as sA, where A is
the accessible surface area and s is a tension constant of 0.03 kcal/(mol·Å²).
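
Two of the simpler terms named above can be written out directly. The sketch below is illustrative only and rests on assumptions: it reads "distance dependent electrostatics" as a dielectric that grows linearly with distance, ε(r) = ε0·r, uses the approximate conversion factor 332 kcal·Å/(mol·e²), and the function names are hypothetical.

```python
COULOMB_KCAL = 332.0  # approximate conversion, kcal*Å/(mol*e^2)


def coulomb_distance_dependent(q1, q2, r, eps0=4.0):
    """Coulomb energy (kcal/mol) with dielectric eps(r) = eps0 * r, r in Å."""
    if r <= 0.0:
        raise ValueError("distance must be positive")
    return COULOMB_KCAL * q1 * q2 / (eps0 * r * r)


def hydrophobic_free_energy(area, tension=0.03):
    """Surface term s*A: area in Å^2, tension in kcal/(mol*Å^2)."""
    return tension * area


# Opposite unit charges 4 Å apart, and 100 Å^2 of accessible surface.
pair_energy = coulomb_distance_dependent(1.0, -1.0, 4.0)
surface_energy = hydrophobic_free_energy(100.0)
```

With the linear dielectric the pair energy falls off as 1/r², so distant charges contribute much less than they would with a constant dielectric.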

Strategy of the flexible Monte Carlo docking 

The simulations proceeded with multiple, relatively short MC runs 
(2000-5000 generated structures). New docking cycles were started 
from the lowest-energy or other interesting structures found in previous 
runs. Structures saved during various MC runs were sorted by total 
energies and RMSD (root-mean-squared deviation), and compressed into a 
cumulative conformational stack. Binding energies were calculated for 
representative structures of each complex thus obtained. This strategy 
was more efficient than continuous long simulations because the variable 
torsion angles and distance constraints are defined for an initial structure 
and do not change during the MC run. 
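
The compression of saved structures into a cumulative stack can be sketched as below. This is a minimal illustration, assuming that conformations share one coordinate frame (so RMSD is computed without superposition); the 2.0 Å cutoff, the stack size, and the function names are hypothetical choices, not values from the protocol.

```python
import math


def rmsd(a, b):
    """RMSD between two equal-length lists of (x, y, z) points, no refitting."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    total = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(total / len(a))


def compress_stack(conformations, rmsd_cutoff=2.0, max_size=50):
    """Keep low-energy conformations that differ from those already kept.

    conformations: iterable of (energy, coords) pairs.
    """
    stack = []
    for energy, coords in sorted(conformations, key=lambda c: c[0]):
        if all(rmsd(coords, kept) >= rmsd_cutoff for _, kept in stack):
            stack.append((energy, coords))
        if len(stack) == max_size:
            break
    return stack


# Hypothetical two-atom "conformations" from several short MC runs.
saved = [
    (-9.0, [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)]),
    (-10.0, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    (-5.0, [(5.0, 0.0, 0.0), (6.0, 0.0, 0.0)]),
]
stack = compress_stack(saved)
```

Because the structures are visited in order of increasing energy, each cluster of similar conformations is represented by its lowest-energy member, which is the natural starting point for the next docking cycle.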

Binding energies of the peptide-protein complexes 

For low-energy conformations found after several iterative BPMC
cycles, peptide-protein binding energies were estimated using the
equation:

$E_{bind} = E_{compl} - E_{pept} - E_{prot} + E_0$,

where $E_{compl}$ is the energy of the complex, $E_{pept}$ and $E_{prot}$ are the separate
energies of the peptide and protein, respectively, and $E_0$ is an adjustable
constant.

The binding energy function included: exact-boundary electrostatic 
free energy contributions; side-chain entropy; and surface tension 
hydrophobic free energy terms (Zhou and Abagyan (1998) Folding
Design 3:513; Schapira et al. (1999) J. Mol. Recognition 12:177).
ECEPP/3 hydrogen-bonding terms were included with a weight of 0.5.
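
As a worked illustration of the binding energy estimate, the sketch below sums the listed terms for each state and takes the difference between the complex and its separated partners. The class, the field names, and all numerical values are hypothetical; only the choice of terms, the 0.5 weight on hydrogen bonding, and the adjustable constant come from the text.

```python
from dataclasses import dataclass


@dataclass
class EnergyTerms:
    """Energy components in kcal/mol for one molecular state."""
    electrostatic: float  # exact-boundary electrostatic free energy
    entropy: float        # side-chain entropy contribution
    hydrophobic: float    # surface-tension hydrophobic term
    hbond: float          # ECEPP/3 hydrogen-bonding term

    def total(self, hbond_weight=0.5):
        return (self.electrostatic + self.entropy + self.hydrophobic
                + hbond_weight * self.hbond)


def binding_energy(complex_, peptide, protein, e0=0.0):
    """E_bind = E_compl - E_pept - E_prot + E_0."""
    return complex_.total() - peptide.total() - protein.total() + e0


# Made-up component values, for illustration only.
e_complex = EnergyTerms(-120.0, 10.0, -30.0, -40.0)
e_peptide = EnergyTerms(-20.0, 2.0, -5.0, -6.0)
e_protein = EnergyTerms(-95.0, 6.0, -20.0, -30.0)
e_bind = binding_energy(e_complex, e_peptide, e_protein)
```

Since the adjustable constant shifts every complex by the same amount, it cancels whenever two ligands or two protein variants are compared.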

Results

Models of the NS3-peptide complexes 

RMSD between pharmacophore atoms of peptides 1 and 2 were 
calculated for all pairs of BPMC structures. Two models of the NS3- 
peptide complexes were selected assuming (1) similar positions of 
pharmacophore groups of the two peptides in the binding site (RMSD < 2.0
Å) and (2) low binding energy of the complexes ($\Delta E_{bind}$ < 5.0 kcal/mol).
Two models of the NS3-peptide complex were selected by visual 
inspection. 

Characteristics of the binding sites for peptide inhibitors in two 
NS3-peptide complex models are summarized in Table 1. 



Table 1

Site  Peptide      NS3 residue,        Type of       Present for Peptide
      residue      group               interaction   Model 1     Model 2

P1    Cys6 COO-    K136 NH3+           H-bond/el.    1,2         1,2
                   G137 NH             H-bond        1,2         2
                   S139 OH             H-bond        1,2         2
      Cys6 SH      L135, F154, A157    hydroph.      1,2         1,2
P2    Cha5         H57, R155, A156     hydroph.      1,2         -
                   A157, V158          hydroph.      -           2
P3    Ile4         V132, S133          hydroph.      1,2         2
                   V158, C159          hydroph.      -           1
P4    Leu3         Res. 157 to 160     hydroph.      1,2         2
                   V132, S133          hydroph.      -           1
P5    Glu2 COO-    R161 guanidine      H-bond/el.    -           1,2
P6    Asp1 COO-    R161 guanidine      H-bond/el.    1,2         -
                   S133 OH             H-bond        -           1,2


Validation of the models: modifications of the protein and ligands
in the binding site 

In order to validate the proposed models, the K136M mutation and 
peptide modifications known from SAR (structure-activity relationship) 
studies were performed in low-energy structures of the NS3-peptide 2 
complex. 

Positions of the modified ligand and conformations of adjacent 
protein side chains were adjusted by energy minimization. Distance 
restraints were applied to keep the ligand near its initial position. 

Changes in calculated binding energies upon modifications, $\Delta E_{bind}(\mathrm{calc})$,
were compared to the values expected from ratios of inhibitory
potencies, $\Delta E_{bind}(\mathrm{exp})$:

$\Delta E_{bind}(\mathrm{exp}) = RT \ln(IC_{50}^{mod} / IC_{50}^{0})$,

where $IC_{50}^{0}$ and $IC_{50}^{mod}$ are the inhibitory potencies of the parent and
modified compounds, respectively.
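
To give a sense of scale for this conversion, the sketch below evaluates the expression for a fourfold increase in IC50 (for example, from 15 nM to 60 nM). The temperature of 298.15 K is an assumption, since the text does not state one, and the function name is hypothetical.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def delta_e_bind_exp(ic50_mod, ic50_parent, temperature=298.15):
    """Expected binding-energy change RT*ln(IC50_mod / IC50_parent), kcal/mol."""
    if ic50_mod <= 0 or ic50_parent <= 0:
        raise ValueError("IC50 values must be positive")
    return R_KCAL * temperature * math.log(ic50_mod / ic50_parent)


# Fourfold loss of potency: parent at 15 nM, modified compound at 60 nM.
shift = delta_e_bind_exp(60.0, 15.0)
```

Under these assumptions a fourfold change corresponds to roughly 0.8 kcal/mol, and each tenfold change to about 1.4 kcal/mol, so the calculated energies must be accurate to about 1 kcal/mol to rank such modifications.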

The correlation between experimental and calculated changes in
binding energy upon ligand modifications in the binding site of NS3 is
illustrated in FIG. 4.

Discussion

The two NS3-peptide complex models suggest a common binding
pattern for the inhibitor P1 site (Cys6-OH) with the carboxyl group
hydrogen-bonded to the oxyanion hole residues G137 and S139, and the
Cys6 side chain embedded in a hydrophobic pocket formed by L135, F154
and A157.

This study confirms the possibility of hydrogen bonding between
the C-terminal carboxyl and the ε-amino group of K136 suggested by
Steinkuhler et al. ((1998) Biochemistry 37:8899) based on the K136M
mutation in NS3. Changes in calculated binding energies upon mutation
are consistent with an 8-fold loss in potency of an inhibitor with a free
carboxyl group and with the lack of an effect on binding when the peptide
is amidated.

The models differ in binding of the negatively charged side chains 
in positions P5 and P6. The R161 guanidine interacts with a carboxyl 
group of Asp1 and Glu2 in Models 1 and 2, respectively. In Model 2, the
Asp1 carboxyl also interacts with the hydroxyl of S133.

The models are in agreement with SAR data for peptide inhibitors 
of NS3. Predicted changes in binding energy upon modification of the 
protein and peptides correlate reasonably well with the changes expected 
from IC50 ratios. Standard deviations of $\Delta E_{bind}(\mathrm{calc}) - \Delta E_{bind}(\mathrm{exp})$ were 0.8
and 1.6 kcal/mol for Models 1 and 2, respectively, with correlation
coefficients of 0.62. After the largest outlier was removed from each
dataset, correlations improved to 0.81 and 0.76, respectively.
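
The statistics quoted above (correlation coefficient, standard deviation of the differences, and the effect of removing the largest outlier) can be computed as in the following sketch. The data are invented for illustration and are not the values behind FIG. 4; defining the outlier as the point with the largest absolute difference between calculated and expected values is an assumption.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def residual_sd(calc, exp):
    """Sample standard deviation of the differences calc - exp."""
    diffs = [c - e for c, e in zip(calc, exp)]
    mean = sum(diffs) / len(diffs)
    return math.sqrt(sum((d - mean) ** 2 for d in diffs) / (len(diffs) - 1))


def drop_largest_outlier(calc, exp):
    """Remove the pair with the largest |calc - exp|."""
    worst = max(range(len(calc)), key=lambda i: abs(calc[i] - exp[i]))
    keep = [i for i in range(len(calc)) if i != worst]
    return [calc[i] for i in keep], [exp[i] for i in keep]


# Hypothetical energy changes in kcal/mol; the last pair is a clear outlier.
calc = [0.1, 0.9, 2.1, 2.9, 4.2, 0.5]
exp = [0.0, 1.0, 2.0, 3.0, 4.0, 4.5]
r_all = pearson_r(calc, exp)
calc_trim, exp_trim = drop_largest_outlier(calc, exp)
r_trimmed = pearson_r(calc_trim, exp_trim)
```

A single poorly predicted modification can dominate the correlation for a small dataset, which is why both the full and the trimmed values are worth reporting.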

Conclusions

An effective iterative Biased Probability Monte Carlo protocol for 
the docking of flexible peptide ligands into a flexible protein active site 
has been developed. Two models of the complexes of HCV NS3 protease 
with potent peptide inhibitors were proposed based on the docking 
simulations and on evaluation of protein-ligand binding energies. The 
models were validated by in situ modifications of NS3-peptide complexes 
and by correlation of binding energies of modified complexes with those 
expected from experimental IC50 values. The proposed models can be used
for planning further mutagenesis studies of the HCV NS3 protease and 
the models can be used in the design of non-peptide inhibitors using 
structure-based drug design methodologies. 

EXAMPLE 2 

LEAD OPTIMIZATION BY RECEPTOR-BASED FREE ENERGY 
QUANTITATIVE STRUCTURE ACTIVITY RELATIONSHIPS (QSARS) FOR 
TNF RECEPTOR ANTAGONIST DISCOVERY 

The goal of the modeling studies in this phase was to identify 
binding modes and complex structures of the compounds that bind to 
TNF receptor type I protein in order to guide the design of new
compounds. An approach was developed that relies on docking compounds to
the receptor, evaluating free energy changes of binding of the docked
structures, and comparing the calculated values with experimental
inhibition constants $K_i$ of the compounds. The success of
the calculations was assessed by evaluating the consistency of the
calculated free energy changes of binding and the experimental $K_i$.

The difference in free energy changes of binding between two
compounds with inhibition constants $K_i$ and $K_i'$ can be calculated as

$\Delta\Delta G = -kT \ln(K_i'/K_i)$,

where k and T are Boltzmann's constant and absolute temperature,
respectively.

Thirteen active compounds were studied. Their potencies, as
measured by $K_i$, range from 0.1 to 30 µM, spanning about 3 kcal/mol in
free energy. The calculated free energy changes of binding were found to be
highly consistent with the corresponding experimental values,
with a correlation coefficient of 0.966 and differences of less than 0.5 kcal/mol
(see Table 2 and FIG. 5). The predicted binding modes and complex
structures can thus be accepted with confidence.
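
The stated span of about 3 kcal/mol follows from the range of inhibition constants. The sketch below evaluates the relative free energy for the two ends of that range, assuming a temperature of 298.15 K (not given in the text) and using the molar form of Boltzmann's constant so that the result is in kcal/mol; the function name is hypothetical.

```python
import math

K_B_KCAL = 0.0019872  # Boltzmann's constant per mole, kcal/(mol*K)


def relative_binding_free_energy(k_i, k_i_prime, temperature=298.15):
    """Return -kT * ln(k_i_prime / k_i) in kcal/mol."""
    if k_i <= 0 or k_i_prime <= 0:
        raise ValueError("inhibition constants must be positive")
    return -K_B_KCAL * temperature * math.log(k_i_prime / k_i)


# Weakest (30 µM) versus strongest (0.1 µM) compound in the series.
span = relative_binding_free_energy(k_i=30.0, k_i_prime=0.1)
```

A 300-fold range in the inhibition constant therefore corresponds to a little over 3 kcal/mol, which sets the accuracy a free energy calculation needs in order to rank compounds within the series.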

To modify these compounds, important pharmacophore features on 
the surface of the receptor that are critical for binding of the compounds 
were identified. These features include a hydrophobic belt, a hydrophilic 
belt and 3 hydrogen bond donor sites. A few potential hydrogen
bonding sites, which are not used by the current compounds, were also
identified, and can be used for designing more potent binders.

Graphics-guided redesign of the compounds was performed. The 
free energy calculation was used to predict the binding activity of each 
design. Fourteen new compounds were thus designed and binding 
activities were predicted. The designed molecules, whose chemical
structures were chosen in light of the binding modes of the lead compounds,
were synthesized and shown to have high affinity for the target. Some of them
exhibit a $K_i$ in the low-nanomolar range. Hence the method provided herein
for modification of drugs for binding to calculated 3-D structures of a
target protein resulted in redesigned drug candidates with enhanced
affinity for the target.

This approach has advantages over the traditional x-ray 
crystallography method, which include the following: 

(1) The binding modes are determined for a group of compounds
instead of a single compound; analysis of similarities and differences
reveals rich information about binding mechanisms.

(2) The predictive power of the free energy calculation is very 
desirable for redesign of compounds. 

(3) The correlation with the biochemical activities assures 
relevancy of the explored binding modes, while a structure given by x-ray 
crystallography may not necessarily be one related to the biological 
functions of the compound. 

A comparison of calculated relative free energy changes of binding
$\Delta\Delta A$ and experimental $\Delta\Delta G$ converted from inhibition constants $K_i$ (all in
kcal/mol) of the compounds (referenced by a code name) is presented in
Table 2.



Table 2

Compound     ΔΔA       ΔΔG

SBI-2030      0         0
SBI-2002     -0.97     -1.25
SBI-2005     -0.72     -1.14
SBI-307      -0.56     -0.08
SBI-2008     -0.53     -0.82
SBI-2006     -0.34     -0.44
SBI-306      -0.07      0.40
SBI-2000      0.29      0.27
SBI-2001      0.72      1.12
SBI-304       1.55      1.45
SBI-308       1.70      1.78
SBI-305       1.86      1.67
SBI-2048      1.95      1.94


A comparison of calculated versus experimental binding free energy 
changes is given in FIG. 5. 
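
The agreement reported for this series can be checked directly from the values in Table 2. The sketch below recomputes the Pearson correlation coefficient and the largest absolute difference between the calculated and experimental columns; the function and variable names are illustrative.

```python
import math

# (compound, calculated, experimental) in kcal/mol, transcribed from Table 2.
TABLE_2 = [
    ("SBI-2030", 0.00, 0.00), ("SBI-2002", -0.97, -1.25),
    ("SBI-2005", -0.72, -1.14), ("SBI-307", -0.56, -0.08),
    ("SBI-2008", -0.53, -0.82), ("SBI-2006", -0.34, -0.44),
    ("SBI-306", -0.07, 0.40), ("SBI-2000", 0.29, 0.27),
    ("SBI-2001", 0.72, 1.12), ("SBI-304", 1.55, 1.45),
    ("SBI-308", 1.70, 1.78), ("SBI-305", 1.86, 1.67),
    ("SBI-2048", 1.95, 1.94),
]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


calculated = [row[1] for row in TABLE_2]
experimental = [row[2] for row in TABLE_2]
r = pearson_r(calculated, experimental)
max_diff = max(abs(c - e) for c, e in zip(calculated, experimental))
```

Both quantities come out consistent with the figures quoted in the text: a correlation coefficient close to 0.966 and no difference as large as 0.5 kcal/mol.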

EXAMPLE 3 



HIV Protease Models for Drug Studies 

Antiviral therapy for AIDS has focused on the discovery and design of
inhibitors for two main enzyme targets of HIV-1: reverse transcriptase
(RT) and protease (PR). HIV RT is a heterodimer composed of p51 and
p66 subunits. The p51 subunit is composed of the first 450 amino acids
encoded by the RT gene and the p66 subunit is composed of all 560
amino acids of the RT gene. RT is responsible for RNA-dependent DNA
polymerization, RNaseH activity, and DNA-dependent DNA polymerization.

HIV PR is a homodimer of two identical 99-amino acid chains. HIV 
PR is an aspartic proteinase that is responsible for the post-translational 
processing of the viral gag and gag-pol polyprotein gene products, which 
yields the structural proteins and enzymes of the viral particle (see, e.g., 
Erickson et al. (1996) Annu. Rev. Pharmacol. Toxicol. 35:545-571; Bouras
et al. (1999) J. Med. Chem. 42:957-962). Despite several promising new
anti-HIV agents, the clinical emergence of drug-resistant variants of HIV 
limits the long-term effectiveness of these drugs. Genetic analysis of the 
resistant forms of HIV has identified a number of critical mutations in the 
RT and PR genes. Moreover, structural analysis of inhibitor-enzyme 
complexes and mutational modeling studies can lead to a better 
understanding of how these drug-resistant mutations exert their effects at 
the structural and functional levels. 

HIV-PR inhibitor computational binding studies

This example provides the results of a computational study on HIV 
PR. The 3-D protease structure was generated, docked with known viral 
inhibitors, and analyzed via free energy of binding studies described 
herein. A quantitative agreement between the calculated and
experimental protease-drug binding energies was obtained. Moreover, a 
series of 3-D HIV PR models were analyzed to identify the invariant 
regions of the protease. These insights have implications for the design 
of new drugs and therapeutic strategies to combat AIDS drug resistance. 

Optimization of 3D structures 

Five PR inhibitors approved by the FDA for clinical use were used: 
saquinavir, nelfinavir, indinavir, amprenavir, and ritonavir (Figure 6). 
Initial 3-D structures for the wild-type HIV PR complexes with these FDA 
approved inhibitors were obtained from the Protein Data Bank and were 
then optimized using Monte Carlo (MC) simulations with an ECEPP/3 
force field as described in Example 1. The energy function used in the
MC simulations included: ECEPP/3 terms for energy in vacuo (van der
Waals, H-bond, electrostatic and torsion potentials); distance dependent
dielectrics with ε0 = 4.0; and surface free energy calculated using atomic
solvation parameters (Dudek et al. (1998) J. Computational Chem.
75:548-573; Wang et al. (1995) J. Mol. Biol. 253:473-492). Standard
ECEPP charges were used for the protein residues. Lys, Arg, Glu, and
Asp residues were charged. Charged and protonated states of Asp 125
(chain B) were considered as well. The inhibitors were docked into the
active site of the protease, and the protein-drug complexes were
energetically refined using the methods described in Example 1. Partial
charges for the inhibitors were calculated with the Gasteiger-Marsili
method implemented in SYBYL 6.5 (Tripos Assoc., Inc.). Different
protonation states were examined for indinavir and amprenavir, but the
other inhibitors were assumed to be electroneutral. Water molecules
located within 7.0 Å of a ligand atom in the X-ray structure were
retained in the model complex during optimization.
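
The water-retention rule above is a distance cutoff applied to each candidate atom. A minimal sketch follows, assuming coordinates in Å; the function name and the input layout are hypothetical, and treating "within 7.0 Å" as inclusive of the boundary is an assumption.

```python
import math


def within_cutoff(candidates, reference, cutoff):
    """Return names of candidates within `cutoff` Å of any reference atom.

    candidates: dict mapping a name to an (x, y, z) tuple.
    reference: list of (x, y, z) tuples, e.g. the ligand atoms.
    """
    selected = []
    for name, point in candidates.items():
        if any(math.dist(point, ref) <= cutoff for ref in reference):
            selected.append(name)
    return selected


# Hypothetical ligand atoms and crystallographic water positions.
ligand_atoms = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
waters = {
    "W1": (0.0, 6.9, 0.0),   # 6.9 Å from the first ligand atom
    "W2": (3.0, 0.0, 7.1),   # just outside the cutoff
    "W3": (10.5, 0.0, 0.0),  # 7.5 Å from the nearest ligand atom
}
retained = within_cutoff(waters, ligand_atoms, cutoff=7.0)
```

The same helper could serve for any distance-based choice of a substructure, with the reference atoms and the cutoff changed accordingly.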

Calculation of binding energies

For low energy conformations found after several iterative BPMC
cycles, protein-drug binding energies were estimated using the equation:

$E_{bind} = E_{compl} - E_{ligand} - E_{prot} + E_0$,

where $E_{compl}$ is the energy of the complex, $E_{ligand}$ and $E_{prot}$ are the energies of the
ligand and protein when separated, and $E_0$ is an adjustable constant. The
binding energies of the protein and ligand were calculated using the
following energy function:

$E = E_{el} + E_{vw} + E_{hb} + E_s$,

where $E_{el}$ is the exact-boundary electrostatic energy using ε0 = 8.0, $E_s$ is the
side-chain entropy term, and $E_{vw}$ and $E_{hb}$ are the ECEPP/3 van der Waals
and hydrogen-bonding terms.

After the energies of the wild type PR-inhibitor complexes were 
calculated, mutation sites were introduced into the optimized X-ray 
structures or model complexes. The amino acid substitutions were 
followed by local optimization, using an ECEPP/3 force field, of protein 
side chains around the mutation sites via the energy minimization of 
substructures that included the ligand, water molecules within the sphere 
of radius 7.0 Å around the ligand, and protease residues within the
sphere of radius 3-5 Å around the mutated residues. The energy of
binding of the mutated complex was calculated based on the equation
described herein. The difference in binding energy resulting from
mutations (mut) of the wild-type (WT) protease was calculated using the
following equation:

$\Delta E_{bind}(\mathrm{calculated}) = E_{bind}(\mathrm{WT}) - E_{bind}(\mathrm{mut})$.
This change in binding energy was compared to data from experimental
(exptl) studies (Gulnik et al. (1995) Biochemistry 35:9282-9287; Klabe et
al. (1998) Biochemistry 37:8735-8742; Pazhanisami et al. (1996) J. Biol.
Chem. 271:17979-17985; Jacobsen et al. (1995) Virology 206:527-534;
Maschera et al. (1996) J. Biol. Chem. 271:33231-33235) based on the
equation:

$\Delta E_{bind}(\mathrm{exptl}) = RT \ln(K_i^{mut}/K_i^{wt})$.

Plots of $\Delta E_{bind}(\mathrm{calculated})$ vs. $\Delta E_{bind}(\mathrm{exptl})$ were generated, and the results,
summarized in Table 3, show a strong correlation between the calculated 
binding energies and the experimentally determined binding energies for 
the PR-inhibitor complexes. For example, the correlation coefficient R for 
PR-ritonavir and PR-amprenavir is 0.9, where R=1 denotes congruency 
between the computationally calculated and experimentally determined 
binding energy data. These correlation data validate the computational 
protocol and calculations described herein as a method for predicting 
protein-drug binding or protein-drug resistance (i.e., non-binding). The
evaluation of changes in binding energy of protein-drug complexes upon
protein sequence variations can be used as a descriptor and,
thus, can be used to predict the efficacy of drugs on proteins resulting
from polymorphisms in genes. Moreover, the analysis of the free energy of
binding in complexes between the protein models that are produced by 
the method set forth in this example and drugs that have been designed 
or modified is a good predictive tool for drug designers. 

TABLE 3

Correlation between Experimental and Calculated Binding Energies
for HIV Protease Inhibitors

HIV PR       X-ray        No. of exptl.   Correlation     Correlation
Inhibitor    Complex ID   data points     coefficient R   S.D., kcal/mol

Saquinavir   1HXB         18              0.84            0.68
Indinavir    1HSG         17              0.79            0.80
Ritonavir    1HXW         12              0.90            0.72
Amprenavir   1HPV         15              0.90            0.54
Nelfinavir   1OHR         Insufficient data

Identification of structural invariant regions of HIV Protease 

Clinical effectiveness of HIV PR inhibitors is limited by the rapid 
emergence of drug-resistant mutations. Resistant PR variants first occur
by the mutation of amino acids in or near the drug binding
site, which are then accompanied by compensatory mutations of more
distant amino acids. The identification of highly conserved, structural 
invariant regions of a PR would provide new potential targets and thus 
lead to the development of therapeutics having greater clinical efficacy 
than those drugs commonly employed to treat HIV. 

The protein sequences of HIV protease were obtained from 
GenBank and from the blood samples of patients using standard isolation 
and sequencing techniques well known in the art. The protein
sequences were modeled into 3-D structures using the computational
protocol described in Example 1. The protease sequences were aligned,
and the frequency of mutation, regardless of type, was determined at 
each amino acid position and plotted in Figure 7, where the frequency of 
mutation in this set of HIV-1 Protease sequences varied from 0 to 40%. 
Sequence alignment also revealed how many different types of amino 
acids could be substituted in any specific residue, yielding the tolerance 
of each residue to substitutions of different types. The data showing the
frequency of mutation of each residue among the PR sequences, the types of
mutations, and the distance of the mutating residue from the active site
(Asp 25) are shown in FIG. 8. This information, derived from sequences
obtained from 10591 different genotypes, was used to identify invariant and/or highly
conserved regions of PR and to map these regions to a 3-D structure for 
the purpose of identifying new potential regions on the protein as targets 
for therapeutic intervention. These invariant regions include, but are not 
limited to, residues 1-9, 25-29, 49-52, 78-81, and 94-99, where residue 
1 is an aliphatic amino acid, more preferably proline; residue 2 is a 
hydrophilic amino acid, more preferably glutamine; residue 3 is an 
aliphatic amino acid, more preferably isoleucine; residue 4 is a hydrophilic 
amino acid, more preferably threonine; residue 5 is a hydrophobic amino 
acid, more preferably leucine; residue 6 is an aromatic amino acid, more 
preferably tryptophan; residue 7 is a hydrophilic amino acid, more 
preferably glutamine; residue 8 is a basic amino acid, more preferably
arginine; residue 9 is an aliphatic amino acid, more preferably proline;
residue 25 is a hydrophilic amino acid, more preferably aspartic acid;
residue 26 is a hydrophilic amino acid, more preferably threonine; residue
27 is an aliphatic amino acid, more preferably glycine; residue 28 is an
aliphatic amino acid, more preferably alanine; residue 29 is an acidic
amino acid, more preferably aspartic acid; residue 49 is an aliphatic amino
acid, more preferably glycine; residue 50 is a hydrophobic amino acid,
more preferably isoleucine; residue 51 is an aliphatic amino acid, more
preferably glycine; residue 52 is an aliphatic amino acid, more preferably
glycine; residue 78 is an aliphatic amino acid, more preferably glycine;
residue 79 is an aliphatic amino acid, more preferably proline; residue 80
is a hydrophilic amino acid, more preferably threonine; residue 81 is an
aliphatic amino acid, more preferably proline; residue 94 is an aliphatic
amino acid, more preferably glycine; residue 95 is a thio-containing amino
acid, more preferably cysteine; residue 96 is a hydrophilic amino acid, more
preferably threonine; residue 97 is a hydrophobic amino acid, more
preferably leucine; residue 98 is a hydrophilic amino acid, more preferably
asparagine; and residue 99 is an aromatic amino acid, more preferably
phenylalanine. These invariant regions can subsequently be used to
assist in the design of drugs or therapeutic agents which bind to the
invariant regions and disrupt the activity of the protease with greater 
efficacy than drugs commonly used to treat HIV and where the free 
energy of binding between said drug or therapeutic agent and the 
structural invariant region is evaluated as described herein. The methods 
described in this example can also be applied to HIV RT and to any 
protein of interest that exhibits polymorphisms. 
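
The per-position analysis described in this example (frequency of mutation, types of substitution, and runs of invariant positions) can be sketched as follows. The sequences are short toy strings rather than patient data, the sequences are assumed to be already aligned to the reference with equal length, and the function names and the minimum run length are hypothetical.

```python
from collections import Counter


def mutation_profile(aligned, reference):
    """Per-position mutation frequency and observed substitutions.

    Returns a list of (position, frequency, sorted substitute residues),
    with positions numbered from 1.
    """
    if any(len(seq) != len(reference) for seq in aligned):
        raise ValueError("all sequences must match the reference length")
    profile = []
    for i, ref_res in enumerate(reference):
        counts = Counter(seq[i] for seq in aligned)
        mutated = len(aligned) - counts.get(ref_res, 0)
        substitutes = sorted(res for res in counts if res != ref_res)
        profile.append((i + 1, mutated / len(aligned), substitutes))
    return profile


def invariant_runs(profile, max_freq=0.0, min_len=2):
    """Group consecutive positions with frequency <= max_freq into runs."""
    runs, start = [], None
    for pos, freq, _ in profile:
        if freq <= max_freq:
            start = pos if start is None else start
        else:
            if start is not None and pos - start >= min_len:
                runs.append((start, pos - 1))
            start = None
    last = profile[-1][0]
    if start is not None and last - start + 1 >= min_len:
        runs.append((start, last))
    return runs


# Toy ten-residue alignment; real input would be full-length PR sequences.
reference = "PQITLWQRPL"
aligned = ["PQITLWQRPL", "PQITLWQRPI", "PQVTLWQRPL", "PQITLWQRPF"]
profile = mutation_profile(aligned, reference)
runs = invariant_runs(profile)
```

Raising `max_freq` above zero turns the strict invariance test into a "highly conserved" test, and the number of distinct substitutes at a position gives its tolerance to substitution.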

EXAMPLE 4 

Computational Phenotyping of HIV-1 Protease and Reverse Transcriptase

Computational or in silico phenotyping is performed to assess
phenotypic properties of a protein. This example demonstrates
application of this method to HIV-1 protease and reverse transcriptase to
test the efficacy of various protease inhibitors for an HIV patient.

To practice this method, 3-D structures of HIV-1 protease and
reverse transcriptase based upon the nucleic acid isolated from HIV from
a patient are generated. Protein-drug binding analysis is then performed
in silico in order to determine whether drug binding does (i.e.,
sensitivity) or does not (i.e., resistance) take place.

Sequencing of HIV-1 Protease and Reverse Transcriptase is 
performed on HIV-1 cDNA following extraction, reverse transcription, and 
PCR amplification of viral RNA obtained from patient specimens, such as 
blood samples or other body fluid or tissue samples. Methods for the 
extraction, reverse transcription, and PCR amplification of viral RNA are 
well known in the art. For each sequence, a computer-generated 3-D 
structure of the protein is modeled and then docked with antiviral drugs in 
silico using methods described in Example 1 and elsewhere herein to 
analyze protein-drug interactions. Antiviral drugs that can be tested 
include, but are not limited to, saquinavir, indinavir, ritonavir, amprenavir, 
and nelfinavir for HIV protease; zidovudine, lamivudine, stavudine, 
zalcitabine, didanosine, abacavir, adefovir, delavirdine, nevirapine, and 
efavirenz for HIV reverse transcriptase; and any FDA-approved or non- 
FDA approved antiviral drug. From these protein-drug interaction studies, 
relative drug resistance or sensitivity is inferred by calculating and 
evaluating the free energy of binding in low energy conformations of 
complexes between the variant protease structure and docked antiviral 
drug or variant reverse transcriptase structure and docked antiviral drug, 
using the methods described in Examples 1 and 3 and elsewhere herein. 

The results of the computational phenotyping procedure can be
presented as a patient report that states whether the RT or PR obtained
from the patient is sensitive or resistant to a drug or drugs. Such a
patient report assists physicians in selecting appropriate drugs for HIV
patients. It also is useful for the in vitro diagnostics industry in an
adjunct test/service capacity to help optimize antiviral therapy.
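
A patient report of the kind described could be assembled from per-drug binding energy changes as in the sketch below. Everything specific here is hypothetical: the input is taken to be the loss of binding energy relative to wild type (positive when the variant binds the drug more weakly), the 1.0 kcal/mol threshold is a placeholder rather than a validated cutoff, and the report layout is invented.

```python
def classify(delta_e_bind, threshold=1.0):
    """Label a drug from the loss of binding energy (kcal/mol) vs. wild type.

    The default threshold is a placeholder, not a validated cutoff.
    """
    return "resistant" if delta_e_bind >= threshold else "sensitive"


def patient_report(patient_id, target, delta_e_by_drug, threshold=1.0):
    """Build a plain-text report with one line per drug, sorted by name."""
    lines = [f"Patient: {patient_id}", f"Target: {target}"]
    for drug in sorted(delta_e_by_drug):
        value = delta_e_by_drug[drug]
        label = classify(value, threshold)
        lines.append(f"{drug:<12}{value:>6.2f} kcal/mol  {label}")
    return "\n".join(lines)


# Made-up values for one patient-derived protease variant.
report = patient_report(
    "P-001", "HIV-1 PR", {"saquinavir": 2.3, "indinavir": 0.2})
```

In practice the cutoff would need to be calibrated per drug against experimental phenotypes, for instance using correlations such as those summarized in Table 3.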

EXAMPLE 5 

HIV Protease and Reverse Transcriptase Databases 

Exemplary databases of the 3-D protein structures of polymorphic 
variants are described in this example. The HIV PR and RT databases are 
a comprehensive collection of 3-D polymorphic structural data along with 
related information, including nucleic acids encoding all or a portion of the 
protein. These data provide a means to understand differences in the 
interactions between a drug or drugs and the structural variations of the 
drug targets. 

This example describes the creation, interface for, and use of structural 
variant databases of HIV protease and reverse transcriptase polymorphic 
variants. 
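
One possible relational layout for such a structural variant database is sketched below. This is an assumption-laden illustration: the table and column names are invented, and Python's built-in `sqlite3` module is used as a lightweight stand-in for the database server so that the example is self-contained.

```python
import sqlite3

SCHEMA = """
CREATE TABLE variant (
    variant_id   INTEGER PRIMARY KEY,
    target       TEXT NOT NULL,      -- e.g. 'HIV-1 PR' or 'HIV-1 RT'
    nucleic_acid TEXT,               -- encoding sequence, all or part
    protein_seq  TEXT NOT NULL
);
CREATE TABLE structure (
    structure_id INTEGER PRIMARY KEY,
    variant_id   INTEGER NOT NULL REFERENCES variant(variant_id),
    coord_file   TEXT NOT NULL       -- location of the 3-D model coordinates
);
CREATE TABLE binding (
    structure_id INTEGER NOT NULL REFERENCES structure(structure_id),
    drug         TEXT NOT NULL,
    e_bind       REAL NOT NULL       -- calculated binding energy, kcal/mol
);
"""


def build_database(path=":memory:"):
    """Create the hypothetical schema and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def drugs_for_variant(conn, variant_id):
    """List (drug, e_bind) for a variant, most favorable energy first."""
    rows = conn.execute(
        "SELECT b.drug, b.e_bind FROM binding b "
        "JOIN structure s ON s.structure_id = b.structure_id "
        "WHERE s.variant_id = ? ORDER BY b.e_bind",
        (variant_id,),
    )
    return rows.fetchall()
```

Keeping sequences, model structures, and calculated binding energies in separate linked tables lets one variant carry several models and one model carry results for several drugs.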

Construction of databases 

To implement the HIV PR or RT database described herein, a suitable
computer for performing database server tasks includes a "Pentium" level
CPU having at least 128 MB of memory, 30 GB of disk storage, and 256 
MB of disk swap space for files. A recommended configuration for better 
computer performance would include, for example, a "Pentium III" 
processor at 700 MHz or faster, memory of 256 MB or greater, disk 
storage space of 50 GB or more, and swap space of 500 MB or more. A 
suitable configuration for performing user tasks as described above 
includes a "Pentium" level CPU having 128 MB memory, disk space of 
240 MB with swap space of 256 MB, and an optional display circuit card 
supporting OpenGL and having 4 MB of memory. A recommended 
configuration for better performance would include, for example, a 
"Pentium III" processor at 500 MHz or faster, memory of 256 MB or 
greater, disk space of 500 MB or more, swap space of 500 MB or more, 
and an optional display card having 8 MB of memory or more, supporting 
resolution of 1024 x 768. 

Preferably, the software used in the computing system described 
above includes, for the server machine, operating system software such 
as "Windows NT Server 4.0" from Microsoft Corporation, with Service 
Pack 5, Version 1280 (10 June 1999) or more recent, with database 
management server software such as "Oracle Server Standard Edition 
8.1" from Oracle Corporation, or better. The software used in a preferred 
embodiment of the user machine includes operating system software such 
as "Windows NT Workstation 4.0" from Microsoft Corporation, with 
Service Pack 5, version 1280 (10 June 1999) or more recent, as well as 
"Oracle Client Standard Edition Version 8.1" or better. The client 
machine will also be compliant with the "Java" programming language 
(Java Runtime Environment 1.2.2). As will be known to those skilled in 
the art, other configurations may be suitable, depending on the 
applications being used and the computer performance desired. 

Database Interface 

The database interface is Java-based. The database is interfaced to a 
molecular graphics package that provides structure analysis tools and 3-D 
visualization, including wire-frame representations, secondary structure 
ribbons, and solid surfaces. The database also provides an interface to 
access all of the collected files for the same 3-D structure. The database 
interface also provides access to other databases, such as databases of 
chemical structures and public domain databases such as GenBank and the 
Protein Data Bank. The OpenGL and C++ module has real-time interaction 
with the sequence display and sequence analysis modules, such that 
highlighting residues in one display results in highlighting those same 
residues in other displays. 

The relational database containing the protein information may be 
structured according to relational objects to facilitate the analysis and 
computation processes described in the preceding examples. FIG. 10 is a 
graphical representation of the database objects for the system described 
herein. The database is organized by classes, each of which is 
characterized by data attributes and subclasses for the proteins. 

FIG. 10 shows that the database design includes classes 
comprising Variant and related classes of Sample, Residue, Model, 
ResistanceEntry, and Protein. Other classes include Conformation, 
Residue_Conformation, Atom, Drug, Family, and Subfamily. These 
classes store attribute data values and specify class parameters and 
behaviors to provide the functionality described herein. 

For example, FIG. 10 shows that the Variant class stores 
parameters to specify a variant, including subclasses that specify a 
Variant_ID, Sample_ID, Protein_ID, Name, and Sequence, where 
Variant_ID is the identification number of the variant; Sample_ID is the 
identification number of the sample from which HIV PR and RT were 
obtained; Protein_ID is the identification number of the protein, i.e., PR 
or RT; Name is the name of the variant, distinguishing it from other 
variants encoded by the same DNA due to ambiguities in the nucleic acid 
sequence; and Sequence is the nucleotide or amino acid sequence. 

Similarly, FIG. 10 shows that the Sample class includes subclasses 
relating to a specific sample and which specify Sample_ID, Sample_Date, 
Sex, Ambiguity_Number, Distance, Sequence_Length, Sequence, Clade, 
and Region, where Sample_ID is as defined herein; Sample_Date is the 
date the sample was obtained; Sex is the gender of the sample donor; 
Ambiguity_Number is the fraction of ambiguous nucleotide positions; 
Distance is a normalized number representing the variation of an amino 
acid from the master clade; Sequence_Length is the length of the 
sequence; Sequence is as defined herein; Clade is the master sequence; 
and Region is the geographic location from which the sample was 
obtained. 

The Model class includes subclasses comprising Model_ID, Model_Name, 
Variant_ID, and Drug_ID, where Model_ID is the identification number of 
the 3-D protein model; Model_Name is the name of the 3-D protein model; 
Variant_ID is as defined herein; and Drug_ID is the identification number 
of the drug, i.e., the antiviral drug. 

The Atom class includes the subclasses comprising Atom_Name, 
Residue_Conformation_ID, X_Coordinate, Y_Coordinate, and 
Z_Coordinate, where Atom_Name is the name of the atom in the 3-D 
protein structure; Residue_Conformation_ID is the identification number 
of the amino acid conformation in a 3-D structure; and X_Coordinate, 
Y_Coordinate, and Z_Coordinate are the coordinates of the atom in the 
3-D protein structure. 

The Conformation class includes the subclasses comprising 
Conformation_ID, Model_ID, and Refinement_Level, where 
Conformation_ID is the identification number of a conformation of a 3-D 
structure; Model_ID is as defined herein; and Refinement_Level is the 
number of times the conformation was refined energetically. 

The Drug class includes the subclasses comprising Drug_ID, Profile, 
Symbol, Name1, Name2, Company, and URL, where Drug_ID is as defined 
herein; Symbol is the FDA symbol for the drug; Name1 is the name of the 
drug; Name2 is an alternative name of the drug; Company is the company 
that makes the drug; and URL is the website address of the company that 
makes the drug. 

The Residue_Conformation class includes the subclasses comprising 
Residue_Conformation_ID, Conformation_ID, and Residue_ID, where 
Residue_Conformation_ID is as defined herein; Conformation_ID is as 
defined herein; and Residue_ID is the identification number of the amino 
acid. 

The ResistanceEntry class includes the subclasses comprising 
Resistance_Entry_ID, Profile, Protein_ID, Residue_Number, Amino_Acid, 
Weight, and Maximum_Weight, where Resistance_Entry_ID is the 
identification number of the resistance entry; Protein_ID is as defined 
herein; and Amino_Acid is the amino acid. 

The Family class includes the subclasses comprising Family_ID and 
Family_Name, where Family_ID is the identification number of the protein 
family and Family_Name is the name of the protein family. The SubFamily 
class includes the subclasses comprising SubFamily_ID, SubFamily_Name, 
and Family_ID, where SubFamily_ID is the identification number of the 
protein subfamily, SubFamily_Name is the name of the protein subfamily, 
and Family_ID is as defined herein. 

The Protein class includes the subclasses comprising Protein_ID, 
Protein_Name, Species, Multiple_Domain, Multiple_Chain, and Wild_Type, 
where Protein_ID is as defined herein; Protein_Name is the name of the 
protein, i.e., RT or PR; Species is the species of the source of the 
protein, i.e., humans; Multiple_Domain is the domain of the protein, i.e., 
p66 or p51 in the case of RT; Multiple_Chain is the a or b chain in the 
dimers of RT and PR; and Wild_Type is the wild-type protein sequence for 
RT and PR. 

The Residue class includes the subclasses comprising Residue_ID, 
Variant_ID, Chain, Residue_Number, Insertion_Code, and Residue_Code, 
where Residue_ID is the identification number of the amino acid; 
Variant_ID is as defined herein; Residue_Number is the numbering of an 
amino acid in a protein sequence; Insertion_Code is the identification 
number if different insertions occur in the amino acid sequence; and 
Residue_Code is the single letter or 3-letter code of an amino acid. 

Those skilled in the art will understand the database design exemplified 
in FIG. 10. It should be understood that other classes or parameters may 
be included, as selected by those skilled in the art, for the desired 
database design. 
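
By way of illustration only, a few of the classes named above could be laid out as relational tables along the following lines. This is a minimal sketch and not the schema of FIG. 10: the column types, the foreign-key constraints, the use of SQLite in place of the Oracle server mentioned above, the helper names build_demo_db and models_for_region, and the sample rows are all assumptions made for the example.

```python
import sqlite3

# Column names follow the class descriptions in the text; the column types
# and constraints are assumptions, since the text does not specify them.
SCHEMA = """
CREATE TABLE Protein (
    Protein_ID   INTEGER PRIMARY KEY,
    Protein_Name TEXT NOT NULL
);
CREATE TABLE Sample (
    Sample_ID        INTEGER PRIMARY KEY,
    Sample_Date      TEXT,
    Sex              TEXT,
    Ambiguity_Number REAL,
    Region           TEXT
);
CREATE TABLE Variant (
    Variant_ID INTEGER PRIMARY KEY,
    Sample_ID  INTEGER NOT NULL REFERENCES Sample(Sample_ID),
    Protein_ID INTEGER NOT NULL REFERENCES Protein(Protein_ID),
    Name       TEXT,
    Sequence   TEXT NOT NULL
);
CREATE TABLE Model (
    Model_ID   INTEGER PRIMARY KEY,
    Model_Name TEXT,
    Variant_ID INTEGER NOT NULL REFERENCES Variant(Variant_ID),
    Drug_ID    INTEGER
);
"""


def build_demo_db():
    """Create an in-memory database holding one made-up record per table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO Protein VALUES (1, 'PR')")
    conn.execute("INSERT INTO Sample VALUES (10, '2000-01-15', 'F', 0.0, 'US')")
    conn.execute("INSERT INTO Variant VALUES (100, 10, 1, 'PR-10-a', 'PQITLW')")
    conn.execute("INSERT INTO Model VALUES (1000, 'PR-10-a-model', 100, NULL)")
    conn.commit()
    return conn


def models_for_region(conn, region):
    """Names of the 3-D models whose source sample came from `region`."""
    rows = conn.execute(
        """SELECT m.Model_Name
           FROM Model m
           JOIN Variant v ON v.Variant_ID = m.Variant_ID
           JOIN Sample s ON s.Sample_ID = v.Sample_ID
           WHERE s.Region = ?
           ORDER BY m.Model_ID""",
        (region,),
    )
    return [row[0] for row in rows]
```

Linking Model to Variant and Variant to Sample through their identification numbers is what allows a query on a sample attribute, such as Region, to return the corresponding 3-D models.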

Database Content 

The databases contain information on the variants of HIV PR and 
RT present in patient populations. The master amino acid sequence, 
nucleic acid sequence, and 3-D structure are obtained from GenBank; an 
exemplary master sequence is set forth in SEQ ID No. 118. Nucleotide 
sequences exhibiting polymorphisms and the corresponding structural 
variant protein sequences are determined by isolating viral nucleic acid 
from the blood samples of patients throughout the US, as well as from 
other countries, and sequencing it using methods well known in the art. 
The sequences were entered into the RT and PR databases. Exemplary 
nucleotide sequences and the encoded amino acids for HIV RT and PR in 
this database are set forth in 
SEQ ID NOS. 3 to 117, where r is g or a; y is t/u or c; m is a or c; k is g 
or t/u; s is g or c; w is a or t/u; b is g or c or t/u; d is a or g or t/u; h is a 
or c or t/u; v is a or g or c; and n is a or g or c or t/u or unknown or 
other. The amino acid sequences of the wild type and structural variants 
are used to create 3-D protein structures which are deposited into the 
databases. 
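
As an illustration of how the ambiguity codes listed above can be handled, the following sketch computes the fraction of ambiguous positions in a nucleotide sequence and enumerates every unambiguous sequence consistent with it. The function names are hypothetical, "n" is treated here as any of the four bases (the "unknown or other" case is not modeled), and the enumeration grows exponentially with the number of ambiguous positions, so it is suitable only for sequences with few ambiguities.

```python
from itertools import product

# Ambiguity codes as listed in the text; 't' stands for t/u.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t", "u": "t",
    "r": "ga", "y": "tc", "m": "ac", "k": "gt", "s": "gc", "w": "at",
    "b": "gct", "d": "agt", "h": "act", "v": "agc", "n": "agct",
}


def ambiguity_fraction(seq):
    """Fraction of positions that are not a plain a, c, g, t or u."""
    seq = seq.lower()
    ambiguous = sum(1 for base in seq if len(IUPAC[base]) > 1)
    return ambiguous / len(seq)


def expand(seq):
    """List every unambiguous sequence consistent with the ambiguity codes."""
    choices = [IUPAC[base] for base in seq.lower()]
    return ["".join(combo) for combo in product(*choices)]
```

For example, expand("ary") yields the four sequences agt, agc, aat and aac, each of which could then be translated and modeled as a separate variant.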

1. 3-D Protein Models 

The structures of the wild-type or master sequence models of PR and 
RT were obtained from the crystal structures found in PDB. The initial 
structure was refined energetically using BPMC with an ECEPP force field 
as described in Example 1. The quality of the model was assessed by 
calculating Normalized Residue Energies (NREs), where models with e_av > 
1.5 require further energetic refinement, and models with e_av < 1.5 were 
deposited into the database as described herein. The 3-D protein 
structures of the variant sequences were generated by comparing the 
variant sequences to the master sequence (see, e.g., SEQ ID No. 118; i.e., 
homology modeling) and energetically refining the models ab initio, using 
the same force field and BPMC procedure as for the master sequence and 
applying the same quality control standard as described herein. Figure 11 
is a tabulation of the 3-D coordinates of an exemplary HIV PR entry in a 
database that includes 3-D structures. For US purposes and where 
permitted, Tables 4 and 5 are provided electronically on CD ROM. These 
Tables house the coordinates that represent the 3-D protein structures of 
proteins encoded by the nucleic acids set forth in SEQ ID NOS. 3-117. 
It will be noted that these sequences encode a full length PR and about 
200 nucleotides of the p51 subunit, which is the subunit of interest herein. 
To construct the full-length 3-D structure, the 3-D structure of each 
encoded portion of the p51 subunit was generated and then combined 
with the structure of the master sequence to produce a full-length 
structure. 
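
The quality control step described above can be sketched as a refine-until-acceptable loop. This is an illustration under stated assumptions: e_av is taken to be the arithmetic mean of the per-residue NREs, a value of exactly 1.5 is treated as requiring further refinement (the text leaves that case open), and the score and refine callables are placeholders for the NRE calculation and the BPMC refinement of Example 1, neither of which is implemented here.

```python
def average_nre(residue_energies):
    """Mean of the per-residue normalized energies (assumed meaning of e_av)."""
    return sum(residue_energies) / len(residue_energies)


def refine_until_acceptable(model, score, refine, threshold=1.5, max_cycles=10):
    """Refine `model` until e_av falls below `threshold` or cycles run out.

    `score(model)` must return the per-residue NREs and `refine(model)` must
    return a refined model; both are caller-supplied placeholders.
    Returns (model, e_av, accepted), where `accepted` is True when the model
    is ready to be deposited into the database.
    """
    e_av = average_nre(score(model))
    cycles = 0
    while e_av >= threshold and cycles < max_cycles:
        model = refine(model)
        e_av = average_nre(score(model))
        cycles += 1
    return model, e_av, e_av < threshold
```

The max_cycles cap is an addition for the example so that a model that never reaches the threshold is reported as not accepted rather than refined indefinitely.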

These 3-D structures in the database can be selected and exported 
into computational docking programs for analyzing protein-drug 
interactions with known drugs, new drugs or modified drugs. The database 
can be mined to find protein models that correspond to patients with a 
particular genetic polymorphism, to patients with the most commonly 
occurring polymorphism, to a relevant patient subpopulation (e.g., gender, 
age, race, or other characteristic), to patients receiving a specific 
treatment regimen, to patients exhibiting a particular clinical response, to 
structural invariants, or to other relevant criteria. 

Drugs can be docked into the active sites of PR and RT and subsequently 
energetically refined using an ECEPP force field and BPMC as described in 
Example 1. The quality control criterion is that the protein-drug complex 
represents a low energy conformation, which may take several iterative 
BPMC cycles. Then, the binding energies of the protein-drug complexes 
can be estimated using the methods of Example 1. Drug designers can 
modify the structures of drugs or design new drugs, using methods well 
known in the art, to maximize the drug binding to the models generated by 
this database. 
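
Once binding energies have been estimated for a set of protein-drug complexes, they can be compared across drugs and variants. The sketch below is illustrative only: it assumes the energies are supplied as a mapping from (drug, variant) pairs to numbers, that a more negative value means more favorable binding, and that units are consistent; the function names are hypothetical and no energies are computed here.

```python
def rank_by_binding_energy(energies):
    """Sort (drug, variant) pairs from most to least favorable binding.

    `energies` maps (drug_name, variant_id) to an estimated binding energy;
    more negative is taken to mean tighter binding.
    """
    return sorted(energies.items(), key=lambda item: item[1])


def weakest_binding_per_drug(energies):
    """For each drug, the least favorable energy over all variants."""
    worst = {}
    for (drug, _variant), energy in energies.items():
        worst[drug] = max(energy, worst.get(drug, energy))
    return worst
```

Looking at the least favorable energy per drug is one simple way to ask which candidate retains binding across all variants considered, as opposed to binding one variant very well.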

2. Other Data 

Each PR or RT nucleotide sequence in the database has associated 
with it an identification number, the nucleotide sequence length, the 
translated amino acid sequence (or sequences in cases of ambiguous 
nucleotide positions), a 3-D structure for each amino acid sequence (from 
which a number of structurally related values are calculated), the 
genotyping date, the gender of the patient, the geographical location from 
which the sample was sent, the clade of the sequence, the fraction of 
ambiguous nucleotide positions, drug information, and other clinical 
information. 

Database Usage 

A query menu allows the user to retrieve data based on the various 
fields: sample ID, residue number (with or without specific amino acid 
mutation), date, gender, geographic location, distance from the master 
sequence, and other useful queries. The set of sequences that satisfies 
the user's query is brought up in a sequence display module, in which 
variations from the master sequence are indicated initially, although the 
sequences can also be highlighted according to predicted resistance. This 
subset of sequences can be subjected to further analyses. For example, a 
histogram summarizing the number of mutations at each position in the 
subset can be generated. The 3-D structures for any of the variants in 
the database can be displayed and analyzed in the structure visualization 
module, allowing the user to compare the similarities and differences 
between 3-D structures by superimposing the 3-D structures. The user 
can also export these structures into programs for protein-binding studies 
as described herein. Thus, by mining the databases, a user will access 
3-D structures and clinical and sample information that can be used in and 
correlated with protein-drug binding studies of HIV PR and RT. 
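
The per-position mutation histogram mentioned above can be sketched as follows. The sketch assumes that every variant sequence in the queried subset is already aligned to the master sequence and has the same length (insertions and deletions are not handled), and the function name is hypothetical.

```python
from collections import Counter


def mutation_histogram(master, variants):
    """Count, per 1-based position, how many variants differ from the master."""
    counts = Counter()
    for seq in variants:
        if len(seq) != len(master):
            raise ValueError("sequences must be aligned to the master length")
        for pos, (ref, res) in enumerate(zip(master, seq), start=1):
            if res != ref:
                counts[pos] += 1
    # Positions with no mutations are omitted from the result.
    return dict(sorted(counts.items()))
```

With the master fragment PQITLW and the variants PQITLW, PQVTLW and PQVTLF, the result is {3: 2, 6: 1}: two variants differ at position 3 and one at position 6.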

Database Applications 

The HIV PR and RT databases have many applications. The 
applications include, but are not limited to, any application and method 
provided herein, such as databases that assist in de novo drug design and 
drug binding calculations. In particular, the database can be used in the 
design of 2nd and 3rd generation drugs to combat potential resistance to 
HIV therapy, and it can be used in the design of drugs that will impact a 
broad spectrum of the infected population. The databases provide the 
ability to design drugs that focus on the most highly conserved regions of 
a drug target and drugs that will avoid resistance arising from mutation. 
The database could be used to rank drug candidates by likely efficacy 
within a given subpopulation of patients (e.g., age, race, gender) in 
pre-clinical trials, to predict the most effective drug regimen to give a 
patient, and to design clinical trials. 
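
As a simplified illustration of locating highly conserved regions, the sketch below finds stretches of positions at which every sequence in an aligned set carries the same residue. It is a sequence-level stand-in only: the conserved features referred to herein are determined from the 3-D models, which this example does not attempt. The function name, the equal-length alignment requirement, and the default minimum stretch length of two residues are assumptions of the example.

```python
def conserved_stretches(sequences, min_length=2):
    """Return (start, end) 1-based inclusive ranges conserved in all sequences."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    stretches, start = [], None
    for i in range(length):
        conserved = len({s[i] for s in sequences}) == 1
        if conserved and start is None:
            start = i  # a new conserved run begins here
        elif not conserved and start is not None:
            if i - start >= min_length:
                stretches.append((start + 1, i))
            start = None
    # Close a run that extends to the end of the sequences.
    if start is not None and length - start >= min_length:
        stretches.append((start + 1, length))
    return stretches
```

For the aligned fragments PQITLWQRP, PQVTLWQKP and PQITLWQRP the function returns [(1, 2), (4, 7)]; the single conserved residue at position 9 is dropped because it is shorter than the minimum stretch length.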

Since modifications will be apparent to those of skill in this art, it is 
intended that this invention be limited only by the scope of the appended 
claims. 

CLAIMS 

1. A computer-based method of drug design based on genetic 
polymorphisms, comprising: 

obtaining more than one amino acid sequence of target proteins 
that are the product of a gene exhibiting genetic polymorphisms, wherein 
the sequences represent different genetic polymorphisms; 

generating 3-dimensional (3-D) protein structural variant models 
from the sequences; and 

based upon the structures of the 3-D models, designing drug 
candidates, modifying existing drugs, identifying potential drug 
candidates or identifying modifications of existing drugs based on 
predicted intermolecular interactions of the drug candidates or modified 
drugs with the structural variants. 

2. The method of claim 1, wherein the structure-based drug 
design method comprises: 

computationally docking the drug candidate or modified drug 
molecules with the target protein structural variant models; 

energetically refining the docked complexes; 

determining the binding interactions between the drug candidate or 
modified drug molecules and the structural variants; and 

designing and identifying drugs or modifications to existing drugs 
based on the binding interactions. 

3. The method of claim 2 wherein the binding interactions are 
determined by: 

calculating the free energy of binding between the protein 
structural variant model and the docked molecule; and 

decomposing the total free energy of binding based on the 
interacting residues in the protein active site. 

4. The method of claim 1 wherein: 

after the protein structural variant models derived from a particular 
genetic polymorphism are generated, selected model structures are 
analyzed to determine common structural features that are conserved 
throughout the selected models, wherein 

the conserved structural features are used as a basis for structure- 
based drug design studies. 

5. The method of claim 4, wherein the conserved structural 
features are stretches of non-contiguous residues, wherein each stretch 
contains at least two amino acids. 

6. The method of claim 5, wherein the protein is human 
immunodeficiency virus protease. 

7. The method of claim 6, wherein the conserved residues 
comprise residues 1-9, 25-29, 49-52, 78-81 and 94-99; 
and wherein: 

residue 1 is an aliphatic amino acid; residue 2 is a hydrophilic 
amino acid; residue 3 is an aliphatic amino acid; residue 4 is a hydrophilic 
amino acid; residue 5 is a hydrophobic amino acid; residue 6 is an 
aromatic amino acid; residue 7 is a hydrophilic amino acid; residue 8 is a 
basic amino acid; residue 9 is an aliphatic amino acid; residue 25 is an 
acidic amino acid; residue 26 is a hydrophobic amino acid; residue 27 is 
an aliphatic amino acid; residue 28 is an aliphatic amino acid; residue 29 
is an acidic amino acid; residue 49 is an aliphatic amino acid; residue 50 
is a hydrophobic amino acid; residue 51 is an aliphatic amino acid; residue 
52 is an aliphatic amino acid; residue 78 is an aliphatic amino acid; 
residue 79 is an aliphatic amino acid; residue 80 is a hydrophilic amino 
acid; residue 81 is an aliphatic amino acid; residue 94 is an aliphatic 
amino acid; residue 95 is a thio-containing amino acid; residue 96 is a 
hydrophilic amino acid; residue 97 is a hydrophobic amino acid; residue 98 
is a hydrophilic amino acid; and residue 99 is an aromatic amino acid. 

8. The method of claim 6, wherein the conserved residues 
comprise residues 1-9, 25-29, 49-52, 78-81 and 94-99; 
and wherein: 

residue 1 is proline; residue 2 is glutamine; residue 3 is isoleucine; residue 
4 is threonine; residue 5 is leucine; residue 6 is tryptophan; residue 7 is 
glutamine; residue 8 is arginine; residue 9 is proline; residue 25 is aspartic 
acid; residue 26 is threonine; residue 27 is glycine; residue 28 is alanine; 
residue 29 is aspartic acid; residue 49 is glycine; residue 50 is isoleucine; 
residue 51 is glycine; residue 52 is glycine; residue 78 is glycine; residue 
79 is proline; residue 80 is threonine; residue 81 is proline; residue 94 is 
glycine; residue 95 is cysteine; residue 96 is threonine; residue 97 is 
leucine; residue 98 is asparagine; and residue 99 is phenylalanine. 

9. The method of claim 6, wherein the HIV protease has the 
sequence of amino acids set forth in any of SEQ ID Nos. 3-74 and 
77-117. 

10. The method of claim 9, wherein the residues comprise residues 
1-9, 25-29, 49-52, 78-81 and 94-99. 

10. The method of claim 1, wherein the selected model 
structures represent the structural variants resulting from the most 
commonly occurring genetic polymorphisms. 

11. The method of claim 1, wherein the selected model 
structures represent the structural variants resulting from genetic 
polymorphisms found in a selected patient subpopulation. 

12. The method of claim 1 wherein the structural variant models 
are stored in a relational database, comprising: 

3-D molecular coordinates for the structural variants; 

a molecular graphics interface for 3-D molecular structure 
visualization; 

computer functionality for protein sequence and structural analyses; and 

database searching tools. 

13. The method of claim 12, wherein the database further 
comprises one or more of observed clinical data associated with the 
genetic polymorphisms, subject medical history and subject history. 

14. The method of claim 1, wherein: 

after generating the 3-D protein structural variant models, 
the method comprises: 

computationally docking drug molecules with the target 
protein models; and 

energetically refining the docked complexes; and 
wherein the candidate drugs are specific for a protein with a 
selected polymorphism or specifically interact with all proteins exhibiting a 
polymorphism. 

15. The method of claim 14, wherein the structure-based drug 
design method comprises: 

computationally docking drug or potential new drug candidate 
molecules with the target protein structural variant models; 

energetically refining the docked complexes; 

determining the binding interactions between the drug or potential 
new drug candidate molecules and the structural variants; and 

designing potential new drugs or modifications to existing drugs 
based on the binding interactions. 

16. The method of claim 15, wherein the binding interactions are 
determined by: 

calculating the free energy of binding between the protein 
structural variant model and the docked molecule; and 

decomposing the total free energy of binding based on the 
interacting residues in the protein active site. 

17. The method of claim 14, wherein: 

after the protein structural variant models derived from a particular 
genetic polymorphism are generated, selected model structures are 
analyzed to determine common structural features that are conserved 
throughout the selected models; and 

the conserved structural features are used as a basis for 
structure-based drug design studies. 

18. The method of claim 17, wherein the selected model 
structures represent the structural variants resulting from the most 
commonly occurring genetic polymorphisms. 

19. The method of claim 17, wherein the selected model 
structures represent the structural variants resulting from genetic 
polymorphisms found in a specific patient subpopulation. 

20. The method of claim 12, wherein the selected model 
structures represent structural variants derived from patients that receive a 
specific treatment regimen. 

21. The method of claim 12, wherein the selected model 
structures represent structural variants derived from patients that exhibit 
a particular clinical response to a given drug. 

22. The method of claim 12, wherein the selected model 
structures represent structural variants derived based on the duration of a 
particular drug treatment. 

23. The method of claim 12, wherein the structural variant 
models are stored in a relational database, comprising: 

3-D molecular coordinates for the structural variants; 

a molecular graphics interface for 3-D molecular structure 
visualization; 

functionality for protein sequence and structural analysis; and 

database searching tools. 

24. The method of claim 12, wherein the database further 
comprises observed clinical data associated with the genetic 
polymorphisms, subject medical history and subject history. 

25. A computer-based method of selecting drug therapies for 
patients based on genetic polymorphisms, comprising: 

obtaining amino acid sequences of a target protein that is the 
product of a gene exhibiting genetic polymorphisms, wherein the 
sequences represent different genetic polymorphisms; 

generating 3-D protein structural variant models from the 
sequences; 

computationally docking drug molecules with the target protein 
models; 

energetically refining the docked complexes; 

determining the binding interactions between the drug or potential 
new drug candidate molecules and the models; and 

selecting drug therapies based on the drug or drugs that have the 
most favorable binding interactions with the structural variant models. 

26. The method of claim 25, wherein the binding interactions are 
determined by: 

calculating the free energy of binding between the protein 
structural variant and the docked drug molecule; and 

decomposing the total free energy of binding based on the 
interacting residues in the protein active site. 

27. The method of claim 1, further comprising, after generating the 3-D 
structural variant models, exporting some or all of the models into a 
program that computationally docks the models with test compounds to 
assess intermolecular interactions. 

28. A computer-based method for predicting clinical responses in 
patients based on genetic polymorphisms, comprising: 

obtaining one or more amino acid sequences for a target protein 
that is the product of a gene exhibiting genetic polymorphisms; 

generating 3-D protein structural variant models from the 
sequences; 

building a relational database of protein structural variants derived 
based on genetic polymorphisms and observed clinical data associated 
with particular polymorphisms exhibited in the patients, wherein the 
database comprises: 

3-D molecular coordinates for the structural variant models; 

a molecular graphics interface for 3-D molecular structure 
visualization; 

computer functionality for protein sequence and structural 
analysis; 

database searching tools; and 

observed clinical data associated with the genetic 
polymorphisms, subject medical history and subject history 
associated with the genetic polymorphisms; 

obtaining a target protein structural variant based on the same gene 
associated with a polymorphism in a patient; 

generating a 3-D protein model based on the subject's gene 
sequence; 

screening/comparing the 3-D model derived from the subject to the 
structures contained in the database by: 

identifying structures in the database that are similar to the 
model derived from the subject; and 

predicting a clinical outcome for the patient based on the 
clinical data associated with the identified structures. 

29. A computer-based method for designing therapeutic agents 
that are active against biological targets that have become drug resistant 
due to genetic mutations, comprising: 

obtaining a first 3-D protein structural variant model of a target 
protein against which a given drug has biological activity; 

generating a second 3-D protein structural variant model of the 
target in which genetic mutations have occurred and against which the 
same drug is no longer biologically active; 

comparing the structures of the first and second model to identify 
structural differences; and 

performing structure-based drug design calculations in order to 
identify new drugs or modifications to the existing drug to bring about 
biological activity against the second model. 

30. A computer-based method for identifying compensatory 
mutations in a target protein, comprising: 

obtaining the amino acid sequence of a target protein containing 
multiple amino acid mutations that is expressed in a patient, wherein the 
structure of a form of the target protein that responds to a particular 
drug, including the active site, has been structurally characterized; 

generating a 3-D structural model of the mutated protein; 

comparing the structure of the mutated protein with the form of the 
protein that responds to the drug to identify structural differences and/or 
similarities arising from the mutations; 

comparing the biological activities of the drug against both the 
mutated protein and the form of the protein that responds to the drug to 
determine the effects of the mutations on drug response; and 

identifying the mutations in the protein that affect biological 
activity based on the comparisons. 

31. A method for creating a 3-D structural polymorphism 
relational database, comprising: 

obtaining one or more amino acid sequences of a target protein 
that is the product of a gene exhibiting a genetic polymorphism, wherein 
sequences represent different genetic polymorphisms; 

generating 3-D protein structural variant models from the 
sequences; 

energetically refining the models; 

evaluating the quality of the models; 

optionally obtaining associated clinical properties or data; and 
inputting the model and any associated properties and/or data into 
a relational database. 

32. The method of claim 31, wherein after energetically refining 
the models, the models are further refined. 

33. The method of claim 31, wherein the database comprises 
amino acid sequences of two or more polymorphic variants. 

34. The method of claim 31, wherein the database comprises 
amino acid sequences of ten or more polymorphic variants. 

35. The method of claim 31, wherein the database comprises 
amino acid sequences of about 100 or more polymorphic variants. 

36. The method of claim 31, wherein the database comprises 
amino acid sequences of about 1000 or more polymorphic variants. 

37. The method of claim 31, wherein the database comprises 
amino acid sequences of more than 8000 polymorphic variants. 

38. A database created by the method of claim 31. 

39. The database of claim 38, comprising variant 3-dimensional 
structures of a selected target. 

40. The database of claim 38 that comprises structures of 
proteases or polymerases. 

41. The database of claim 38, wherein the proteases are viral 
proteases or polymerases. 

42. The database of claim 38, wherein the viral proteases are 
human immunodeficiency virus proteases and the polymerase is a viral 
reverse transcriptase. 

43. The method of claim 31, wherein quality is assessed by 
computing the normalized residue energies such that if e_av is > 1.5 a 
model is further refined until e_av is < 1.5; if e_av is < 1.5 a model is 
deposited into the database. 

44. The method of claim 1, wherein the target is an enzyme. 

45. The method of claim 44, wherein the enzyme is a protease 
or polymerase. 

46. The method of claim 45, wherein the polymerase is a reverse 
transcriptase. 

47. The method of claim 44, wherein the target is a protein 
expressed by an infectious agent. 

48. The method of claim 44, wherein the target is an enzyme 
expressed by an infectious agent. 

49. The method of claim 48, wherein the agent is a human 
immunodeficiency virus (HIV). 

50. A computer system, comprising a database containing data 
representative of the three dimensional structure of polymorphic variants 
of a drug target. 

51. The system of claim 50, wherein the target is a cell surface 
receptor or an enzyme. 

52. The system of claim 50, wherein the enzyme is a protease or 
a polymerase. 

53. A database, comprising: 

sequences of nucleotides encoding a protein or portions thereof, 
wherein the proteins comprise polymorphic variants; and the portions encode 
a domain of the protein that comprises a site in the protein that binds to a 
drug candidate; and 

the coordinates of 3-dimensional (3-D) structures of the encoded 
proteins or portions thereof. 

54. The database of claim 53 that is a relational database. 

55. The database of claim 53 that comprises at least 2 
polymorphic variants and the corresponding 3-D structures. 

56. The database of claim 55 that comprises more than 10, 
more than 100, more than 1000, more than 8000, or more than 10,000 
polymorphic variants and the corresponding 3-D structures. 

57. The database of claim 53, wherein the protein is a receptor 
or enzyme from a eukaryotic or prokaryotic organism. 

58. The database of claim 53, wherein the organism is a 
pathogen or a mammal. 

59. The database of claim 53, wherein the pathogen is a virus or 
bacterium and the mammal is a human. 

60. The database of claim 53, wherein the protein is a protease 
or a reverse transcriptase. 

61. A database, comprising the sequences of nucleotides set 
forth in SEQ ID Nos. 3-117 that encode HIV protease or the portion of 
HIV reverse transcriptase set forth in each SEQ ID. 

62. The database of claim 53, further comprising 3-D structural 
coordinates for a protein or portion thereof comprising sequences of 
amino acids encoded by each of SEQ ID Nos. 3-117. 

63. The database of claim 54, wherein the protein is HIV 
protease. 

64. The database of claim 54, wherein the protein is HIV reverse 
transcriptase. 

65. The method of claim 1, wherein the target protein is a 
eukaryotic or prokaryotic protein. 

66. The method of claim 1, wherein the target protein is an 
animal protein, a plant protein or a protein from a pathogen. 
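
The quality rule recited in claim 43 (refine a model while the average normalized residue energy exceeds 1.5, and deposit it once the value falls below 1.5) can be illustrated with a minimal sketch. The sketch rests on assumptions that are not stated in the claims: e_av is taken to be the arithmetic mean of the per-residue normalized energies, a value of exactly 1.5 is treated as still requiring refinement, and `refine` is a hypothetical caller-supplied step standing in for whatever refinement procedure is actually used. A cap on the number of rounds is also added so that the loop terminates.

```python
from typing import Callable, List, Sequence, Tuple

E_AV_CUTOFF = 1.5  # threshold recited in claim 43


def average_energy(residue_energies: Sequence[float]) -> float:
    """Mean of the normalized per-residue energies (assumed meaning of e_av)."""
    if not residue_energies:
        raise ValueError("a model needs at least one residue energy")
    return sum(residue_energies) / len(residue_energies)


def refine_until_acceptable(
    residue_energies: Sequence[float],
    refine: Callable[[List[float]], List[float]],
    max_rounds: int = 10,
) -> Tuple[List[float], bool]:
    """Apply `refine` until e_av < cutoff or max_rounds is exhausted.

    Returns the final energies and a flag saying whether the model
    passed the gate and may be deposited into the database.
    `refine` is a hypothetical placeholder for the real refinement step.
    """
    energies = list(residue_energies)
    for _ in range(max_rounds):
        if average_energy(energies) < E_AV_CUTOFF:
            return energies, True
        energies = refine(energies)
    return energies, average_energy(energies) < E_AV_CUTOFF


if __name__ == "__main__":
    # Toy refinement that halves every residue energy (illustration only).
    final, deposit = refine_until_acceptable([2.0, 4.0], lambda e: [x / 2 for x in e])
    print(final, deposit)
```

The round cap is a design choice of the sketch, not of the claim: a model that never reaches the cutoff is returned with the flag set to false rather than being deposited.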

ABSTRACT OF THE DISCLOSURE 

Provided herein are computer-based methods for generating and 
using three-dimensional (3-D) structural models of target molecules and 
databases containing the models. The targets can be protein structural 
variants derived from genes containing polymorphisms. The models are 
generated using molecular modeling techniques and are used in 
structure-based drug design studies for identifying drugs that bind to 
particular structural variants, for designing allele-specific drugs and 
population-specific drugs, and for predicting clinical responses in 
patients. Computer-based methods for predicting 
drug resistance or sensitivity via computational phenotyping are also 
provided. Databases containing protein structural variant models are also 
provided. 
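
By way of illustration only, a relational layout of the kind described (nucleotide sequences of polymorphic variants stored together with the coordinates of the corresponding 3-D structures) might be sketched as follows. The table names, column names and helper functions are hypothetical and are not taken from the disclosure, and the coordinates used in the example are placeholders rather than data from Tables 4 and 5.

```python
import sqlite3
from typing import Iterable, List, Tuple

Atom = Tuple[str, float, float, float]  # (atom name, x, y, z)


def build_variant_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE variant (
            variant_id     INTEGER PRIMARY KEY,
            protein_name   TEXT NOT NULL,
            nucleotide_seq TEXT NOT NULL
        );
        CREATE TABLE atom (
            variant_id  INTEGER NOT NULL REFERENCES variant(variant_id),
            atom_serial INTEGER NOT NULL,
            atom_name   TEXT NOT NULL,
            x REAL NOT NULL,
            y REAL NOT NULL,
            z REAL NOT NULL,
            PRIMARY KEY (variant_id, atom_serial)
        );
        """
    )
    return conn


def add_variant(conn: sqlite3.Connection, protein_name: str,
                nucleotide_seq: str, atoms: Iterable[Atom]) -> int:
    """Store one polymorphic variant and its structure; return its id."""
    cur = conn.execute(
        "INSERT INTO variant (protein_name, nucleotide_seq) VALUES (?, ?)",
        (protein_name, nucleotide_seq),
    )
    variant_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO atom VALUES (?, ?, ?, ?, ?, ?)",
        [(variant_id, i, name, x, y, z)
         for i, (name, x, y, z) in enumerate(atoms, start=1)],
    )
    return variant_id


def coordinates_for(conn: sqlite3.Connection, variant_id: int) -> List[Atom]:
    """Return the stored coordinates of one variant in atom order."""
    return conn.execute(
        "SELECT atom_name, x, y, z FROM atom "
        "WHERE variant_id = ? ORDER BY atom_serial",
        (variant_id,),
    ).fetchall()


if __name__ == "__main__":
    db = build_variant_db()
    # Placeholder sequence fragment and made-up coordinates, for illustration.
    vid = add_variant(db, "HIV protease", "cctcag",
                      [("N", 0.0, 0.0, 0.0), ("CA", 1.5, 0.0, 0.0)])
    print(coordinates_for(db, vid))
```

Keeping sequences and coordinates in separate tables linked by a variant identifier lets each variant carry a structure of any size, which is one simple way to pair every polymorphic variant with its corresponding 3-D structure.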

<400> 1 

Asp Xaa Leu Ile Xaa Cys 
1 5 

<210> 2 
<211> 6 
<212> PRT 

<213> Artificial Sequence 

<220> 

<223> Modified Hepatitis C Virus NS3 Protease Inhibitor 
Peptide 



<221> ACETYLATION 
<222> 1 



<221> MOD_RES 
<222> 5 

<223> beta-cyclohexylalanine 
<300> 

<301> Ingallinella, P., Altamura, S., Bianchi, E., Talia 

<302> Potent Peptide Inhibitors Of Human Hepatitis C Vir 

<303> Biochemistry 

<304> 37 

<305> 25 

<306> 8906-8914 

<307> 1998-06-23 



<400> 2 

Asp Glu Leu Ile Xaa Cys 
1 5 



<210> 3 
<211> 1045 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV) 



<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> Protease 



<221> CDS 

<222> (298) . . . (1045) 

<223> Portion of Reverse Transcriptase 
<400> 3 

cct cag atc act ctt tgg caa cga ccc cty gtc aca ata aag ata ggg 48 
Pro Gln Ile Thr Leu Trp Gln Arg Pro Xaa Val Thr Ile Lys Ile Gly 
1 5 10 15 

ggc caa cta aaa gaa gct yta tta gat aca gga gca gat gat aca gta 96 
Gly Gln Leu Lys Glu Ala Xaa Leu Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 

tta gaa gaa atg agt tta cca ggg aaa tgg aaa cca aaa atg ata ggg 144 
Leu Glu Glu Met Ser Leu Pro Gly Lys Trp Lys Pro Lys Met Ile Gly 
35 40 45 

gga att gga ggt ttt atc aaa gta aga cag tat gat caa ata ctc ata 192 
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Leu Ile 
50 55 60 

gaa atc tgt gga cat aaa gct ata ggc aca gta tta gta gga cct aca 240 
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr 
65 70 75 80 

cct gtc aac ata att gga aga aat ttg ttg act cag att ggt tgc act 288 
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr 
85 90 95 

tta aat ttg ccc att agt cct att gaa act gta cca gta aaa tta aag 336 
Leu Asn Leu Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384 
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu 
115 120 125 

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa gga 432 
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly 
130 135 140 

aaa att tca aaa att ggg cct gag aat cca tac aat act cca ata ttt 480 
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Ile Phe 
145 150 155 160 

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528 
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe 
165 170 175 

aga gaa ctt aat aag aga aca caa gac ttc tgg gaa gtt caa tta gga 576 
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly 
180 185 190 

ata cca cac ccc gca ggg tta aaa cag aaa aaa tca gta aca ata ctg 624 
Ile Pro His Pro Ala Gly Leu Lys Gln Lys Lys Ser Val Thr Ile Leu 
195 200 205 

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat gaa ggc ttc agg 672 
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Gly Phe Arg 
210 215 220 

aag tat act gca ttt acc ata cct agt aga aat aat gag aca cca ggg 720 
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Arg Asn Asn Glu Thr Pro Gly 
225 230 235 240 

att aga tat cag tac aac gtg ctc cca cag gga tgg aaa gga tca cca 768 
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttt caa agt agc atg aca aga aty tta gag cct ttt aga aaa 816 
Ala Ile Phe Gln Ser Ser Met Thr Arg Xaa Leu Glu Pro Phe Arg Lys 
260 265 270 

caa aat cca gaa ata gtt atc tat caa tac atg gat gat ttg tat gta 864 
Gln Asn Pro Glu Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val 
275 280 285 

gga tct gac tta gaa ata gga cag cat aga gca aaa ata gag gaa ctg 912 
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Ala Lys Ile Glu Glu Leu 
290 295 300 

aga gga cat cta tta aag tgg gga ttt acc aca cca gac aaa aaa cat 960 
Arg Gly His Leu Leu Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys His 
305 310 315 320 

cag aaa gaa cct cca ttt ctt tgg atg ggt tat gaa ctc cat cct gat 1008 
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp 
325 330 335 

aaa tgg aca gta cag cct ata aag ttg cca gaa aaa g 1045 
Lys Trp Thr Val Gln Pro Ile Lys Leu Pro Glu Lys 
340 345 



<210> 4 
<211> 1046 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV) 






<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1046) 

<223> Portion of HIV Reverse Transcriptase 
<400> 4 

cct cag atc act ctt tgg caa cga ccc ctt gtc aca ata aag ata gga 48 
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly 
1 5 10 15 

ggg cag cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96 
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 

gtt gaa gaa atg aat ttg cca gga aaa tgg aaa cca aaa atg ata ggg 144 
Val Glu Glu Met Asn Leu Pro Gly Lys Trp Lys Pro Lys Met Ile Gly 
35 40 45 

gga att gga ggt ttt atc aaa gta aga cag tat gag caa ata gcc gta 192 
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Glu Gln Ile Ala Val 
50 55 60 

gaa aty tgt gga cat aga gct atg ggt aca gta tta gta gga cct aca 240 
Glu Xaa Cys Gly His Arg Ala Met Gly Thr Val Leu Val Gly Pro Thr 
65 70 75 80 

cct gtc aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288 
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr 
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336 
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384 
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu 
115 120 125 

aaa ata aaa gca tta gta gaa atc tgt aca gaa ttg gaa aag gaa ggg 432 
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Leu Glu Lys Glu Gly 
130 135 140 

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480 
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gcc ata aag aaa aag aac agt act aaa tgg aga aaa tta gta gat ttc 528 
Ala Ile Lys Lys Lys Asn Ser Thr Lys Trp Arg Lys Leu Val Asp Phe 
165 170 175 

aga gaa ctt aat aag aga act caa gac ttc tgg gag gtt caa tta gga 576 
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly 
180 185 190 

ata cca cat cca gca ggg tta aaa aag aat aaa tca ata aca gta ctg 624 
Ile Pro His Pro Ala Gly Leu Lys Lys Asn Lys Ser Ile Thr Val Leu 
195 200 205 

gat gtg ggt gat gca tat ttt tca gtt ccc tta tgt gaa gac ttc agg 672 
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Cys Glu Asp Phe Arg 
210 215 220 

aag tat act gca ttt acc ata cct agt gta aac aat gag act cca ggg 720 
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Val Asn Asn Glu Thr Pro Gly 
225 230 235 240 

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga ttc acc 768 
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Phe Thr 
245 250 255 

agc ata ttc caa tgt agc atg aca aaa atc tta gag cct ttt aga aaa 816 
Ser Ile Phe Gln Cys Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys 
260 265 270 

caa aat cca gag ata gtt atc tat caa tac atg gat gat ttg tat gta 864 
Gln Asn Pro Glu Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val 
275 280 285 

gga tct gac tta gaa ata ggg cag cat aga gca aaa ata gag gaa ctg 912 
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Ala Lys Ile Glu Glu Leu 
290 295 300 

aga caa tat ctg tgg aag tgg gga ttt tgc aca cca gaa caa aar cat 960 
Arg Gln Tyr Leu Trp Lys Trp Gly Phe Cys Thr Pro Glu Gln Lys His 
305 310 315 320 

cag aaa gaa cct cct ttc ctt tgg atg ggt tat gaa ctc cat ccc gat 1008 
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp 
325 330 335 

aaa tgg aca gta caa cct ata gtg ctg cca gac aaa ga 1046 
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Asp Lys 
340 345 



<210> 5 
<211> 1104 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV) 

<220> 

<221> CDS 

<222> (0) ... (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1104) 

<223> Portion of HIV Reverse Transcriptase 
<400> 5 

cct cag atc act ctt tgg caa cga ccc atc gtc aca ata aag rta ggg 48 
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Thr Ile Lys Xaa Gly 
1 5 10 15 

ggg caa cta agg gaa gct cta tta gat aca gga gca gat gat aca ata 96 
Gly Gln Leu Arg Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Ile 
20 25 30 

ata gaa gac ata act ttg cca gga aga tgg aca cca aaa atg ata ggg 144 
Ile Glu Asp Ile Thr Leu Pro Gly Arg Trp Thr Pro Lys Met Ile Gly 
35 40 45 

gga att gga ggt ttt gtc aaa gta aga cag tat gat cag ata ccc ata 192 
Gly Ile Gly Gly Phe Val Lys Val Arg Gln Tyr Asp Gln Ile Pro Ile 
50 55 60 

gaa atc tgc gga cat aaa rtt ata agt aca gta ttg gta gga cct aca 240 
Glu Ile Cys Gly His Lys Xaa Ile Ser Thr Val Leu Val Gly Pro Thr 
65 70 75 80 

cca ata aac ata gtt gga aga aat ctg atg act cag att ggt tgc act 288 
Pro Ile Asn Ile Val Gly Arg Asn Leu Met Thr Gln Ile Gly Cys Thr 
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cca gtc aaa tta aag 336 
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384 
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu 
115 120 125 

aaa ata aag gca tta gta gaa att tgt mca gaa ctg gaa atg gat gga 432 
Lys Ile Lys Ala Leu Val Glu Ile Cys Xaa Glu Leu Glu Met Asp Gly 
130 135 140 

aaa att tca aaa att ggg cct gaa aat ccg tac aat act cca gta ttt 480 
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gcc ata aag aaa aag aac agt act aaa tgg aga aaa tta gta gat ttc 528 
Ala Ile Lys Lys Lys Asn Ser Thr Lys Trp Arg Lys Leu Val Asp Phe 
165 170 175 

aga gaa ctt aac aaa aga act caa gac ttc tgg gaa gtt caa tta gga 576 
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly 
180 185 190 

ata cca cat ccc gca ggg tta aag aag aaa aaa tca gta aca gta ctg 624 
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu 
195 200 205 

gat gtg ggt gat gca tat ttt tca att ccc tta tgt gaa gac ttc aga 672 
Asp Val Gly Asp Ala Tyr Phe Ser Ile Pro Leu Cys Glu Asp Phe Arg 
210 215 220 

aag tat act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720 
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly 
225 230 235 240 

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768 
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816 
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys 
260 265 270 

cag aat cca gaa atg gtc atc tat caa tac gtg gat gat ttg tat gta 864 
Gln Asn Pro Glu Met Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val 
275 280 285 

gga tct gac tta gaa ata gag cag cat aga aca aaa ata gat gaa ctg 912 
Gly Ser Asp Leu Glu Ile Glu Gln His Arg Thr Lys Ile Asp Glu Leu 
290 295 300 

aga caa tat ctg tgg aag tgg gga ttt tac aca cca gac aaa aaa cat 960 
Arg Gln Tyr Leu Trp Lys Trp Gly Phe Tyr Thr Pro Asp Lys Lys His 
305 310 315 320 

cag aca gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008 
Gln Thr Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp 
325 330 335 

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056 
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr 
340 345 350 

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104 
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln 
355 360 365 

<210> 6 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV) 

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 
<400> 6 

cct caa atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata ggg 48 
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly 
1 5 10 15 

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96 
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 

tta gaa gat atg aat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144 
Leu Glu Asp Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly 
35 40 45 

gga att gga ggt ttt atc aaa gta agg cag tat gat caa ata ctc ata 192 
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Leu Ile 
50 55 60 

gaa atc tgt gga cat aaa gct ata ggt aca gta tta gta gga cct aca 240 
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr 
65 70 75 80 

cct gtc aac ata att gga agg aat ctg ttg act cag att ggt tgc act 288 
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr 
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336 
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gag gag 384 
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu 
115 120 125 

aaa ata aaa gca tta gta gaa atc tgt aca gaa atg gaa aag gaa ggg 432 
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly 
130 135 140 

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480 
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528 
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe 
165 170 175 

aga gaa ctt aat aag aaa act caa gac ttc tgg gaa gtt caa tta gga 576 
Arg Glu Leu Asn Lys Lys Thr Gln Asp Phe Trp Glu Val Gln Leu Gly 
180 185 190 

ata cca cat ccc gca ggg tta aaa aag aaa aag tca gta aca gta ctg 624 
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu 
195 200 205 

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gac ttc agg 672 
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg 
210 215 220 

aag tac act gca ttt act ata cct agt ata aac aat gag aca cca ggg 720 
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly 
225 230 235 240 

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768 
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa agt agc atg ata aaa atc tta gag cct ttc aga aaa 816 
Ala Ile Phe Gln Ser Ser Met Ile Lys Ile Leu Glu Pro Phe Arg Lys 
260 265 270 

caa aat cca gac atg gtc atc tat caa tac atg gat gat ttg tat gta 864 
Gln Asn Pro Asp Met Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val 
275 280 285 

gga tct gac tta gaa ata gga cag cac aga aca aaa ata gag gaa ctg 912 
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu 
290 295 300 

aga caa cat ctg ttg aag tgg gga ttt acc aca cca gac aag aaa cat 960 
Arg Gln His Leu Leu Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys His 
305 310 315 320 

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008 
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp 
325 330 335 

aaa tgg aca gta cag cct ata aag ctg cca gaa aaa gac agc tgg act 1056 
Lys Trp Thr Val Gln Pro Ile Lys Leu Pro Glu Lys Asp Ser Trp Thr 
340 345 350 

gtc aat gac ata cag aag tta gtg gga aaa tta aat tgg gca agt cag 1104 
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln 
355 360 365 

att tac cca ggg 1116 
Ile Tyr Pro Gly 
370 



<210> 7 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV) 

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 
<400> 7 

cct cag atc act ctt tgg caa cga ccc ctt gtc aca ata aar ata ggg 48 
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly 
1 5 10 15 

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96 
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 

tta gag gaa atn aat tta cca gga aga tgg aaa cca aaa atg ata ggg 144 
Leu Glu Glu Xaa Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly 
35 40 45 

gga att gga ggt ttt atc aaa gta aga cag tat gat cag ata ctt gta 192 
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Leu Val 
50 55 60 

gaa aty tgt gga cat aar gct ata ggt aca gta tta gta gga cct aca 240 
Glu Xaa Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr 
65 70 75 80 

ccc gtc aac ata att gga aga aat ctg ttg act caa att ggt tgc act 288 
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr 
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336 
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

ccg gga atg gat ggc ccc aaa gtt aaa cat ggc cct ttg aca gaa gaa 384 
Pro Gly Met Asp Gly Pro Lys Val Lys His Gly Pro Leu Thr Glu Glu 
115 120 125 

aaa ata aag cct tta gtt gaa att tgt aca gaa atg gga aaa gaa ggg 432 
Lys Ile Lys Pro Leu Val Glu Ile Cys Thr Glu Met Gly Lys Glu Gly 
130 135 140 

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480 
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528 
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe 
165 170 175 

aga gaa ctt aat aaa aga act caa gac tty tgg gaa gtc caa tta gga 576 
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly 
180 185 190 

ata cca cat ccc tca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624 
Ile Pro His Pro Ser Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu 
195 200 205 

gat gtg ggt gat gca tat ttt tca gtt ccc ttg gat gaa gac tta gag 672 
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Leu Glu 
210 215 220 

aag tat act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720 
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly 
225 230 235 240 

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768 
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816 
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys 
260 265 270 

caa aat cca gac ata gtt atc tat caa tac atg gat gat ttg tat gta 864 
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val 
275 280 285 

gga tca gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912 
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu 
290 295 300 

aga caa cat ctg ttg ggg tgg ggg ttt acc aca cca gac aaa aaa cat 960 
Arg Gln His Leu Leu Gly Trp Gly Phe Thr Thr Pro Asp Lys Lys His 
305 310 315 320 

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008 
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp 
325 330 335 

aaa tgg aca gta cag cct ata gtg ctg cca aca aaa gac agc tgg act 1056 
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Thr Lys Asp Ser Trp Thr 
340 345 350 

gtc aat gac ata cag aag tta gtg gga aaa ttg aac tgg gca agt cag 1104 
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln 
355 360 365 

att tat gca ggg 1116 
Ile Tyr Ala Gly 
370 



<210> 8 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV) 

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 
<400> 8 

cct cag atc act ctt tgg caa cga ccc cty gtc aca gta aag ata ggg 48 
Pro Gln Ile Thr Leu Trp Gln Arg Pro Xaa Val Thr Val Lys Ile Gly 
1 5 10 15 

ggg caa ata aag gaa gct yta tta gat aca gga gca gat gat aca gta 96 
Gly Gln Ile Lys Glu Ala Xaa Leu Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa ata ata ggg 144 
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Ile Ile Gly 
35 40 45 

gga att gga ggt ttt atc aaa gta aga cag tat gat cag gta ccc ata 192 
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Val Pro Ile 
50 55 60 

gaa atc tgt gga caa aaa gct ata agt aca gta tta gta gga cct aca 240 
Glu Ile Cys Gly Gln Lys Ala Ile Ser Thr Val Leu Val Gly Pro Thr 
65 70 75 80 

cct gtc aat ata att gga aga aat ctg atg act cag att ggt tgc act 288 
Pro Val Asn Ile Ile Gly Arg Asn Leu Met Thr Gln Ile Gly Cys Thr 
85 90 95 

tta aat ttt cct att agt cct att gaa act gta cca gta aaa tta aag 336 
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggc cca aga gtt aaa caa tgg cca ttg aca gaa gaa 384 
Pro Gly Met Asp Gly Pro Arg Val Lys Gln Trp Pro Leu Thr Glu Glu 
115 120 125 

aaa ata aaa gca tta gta gaa atc tgt aca gaa atg gaa aag gaa ggg 432 
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly 
130 135 140 

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480 
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gcc ata aag aaa aaa ggc agt aac aga tgg aga aaa tta gta gat ttc 528 
Ala Ile Lys Lys Lys Gly Ser Asn Arg Trp Arg Lys Leu Val Asp Phe 
165 170 175 

aga gaa ctt aat aag aaa act caa gac ttc tgg gaa gtt caa tta gga 576 
Arg Glu Leu Asn Lys Lys Thr Gln Asp Phe Trp Glu Val Gln Leu Gly 
180 185 190 

ata cca cat ccc gca ggg cta aaa aag aaa aaa tca gta aca gta ctg 624 
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu 
195 200 205 

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gaa ttc agg 672 
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Glu Phe Arg 
210 215 220 

aag tat act gca ttt acc ata cct agt aca aac aat gag aca cca ggg 720 
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly 
225 230 235 240 

att aga tat cag tac aat gtg ctt cca caa gga tgg aaa ggg tca cca 768 
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816 
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys 
260 265 270 

caa aat cca gac ata gtt atc tat caa tac atg gat gat ttg tat gta 864 
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val 
275 280 285 

agc tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa cta 912 
Ser Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu 
290 295 300 

aga caa cat ctg ttg agg tgg gga tta acc aca cca gac aaa aaa cat 960 
Arg Gln His Leu Leu Arg Trp Gly Leu Thr Thr Pro Asp Lys Lys His 
305 310 315 320 

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008 
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp 
325 330 335 

aaa tgg aca gta cag cct ata gtg ctg cca gar aaa gac agc tgg act 1056 
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr 
340 345 350 

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt caa 1104 
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln 
355 360 365 

att tac cca ggg 1116 
Ile Tyr Pro Gly 
370 



<210> 9 

<211> 1116 

<212> DNA 

<213> Human Immunodeficiency Virus (HIV) 

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 
<400> 9 

cct cag atc act ctt tgg caa cga ccc cty gtc aaa gta aag ata ggg 48 
Pro Gln Ile Thr Leu Trp Gln Arg Pro Xaa Val Lys Val Lys Ile Gly 
1 5 10 15 

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96 
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144 
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly 
35 40 45 

gga att gga ggt ttt atc aaa gta aga cag tat gat cag ata ccc ata 192 
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Pro Ile 
50 55 60 

gaa atc tgt gga cat aaa gct ata ggt aca gta tta gta gga cct aca 240 
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr 
65 70 75 80 

cct gtc aac ata atw gga aga aat ctg ttg act cag att ggt tgc act 288 
Pro Val Asn Ile Xaa Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr 
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336 
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384 
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu 
115 120 125 

aaa ata aaa gca tta ata gaa att tgt aca gag atg gag aag gaa ggg 432 
Lys Ile Lys Ala Leu Ile Glu Ile Cys Thr Glu Met Glu Lys Glu Gly 
130 135 140 

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480 
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528 
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe 
165 170 175 

aga gaa ctt aat aag aaa act caa gay ttc tgg gaa gtt car tta gga 576 
Arg Glu Leu Asn Lys Lys Thr Gln Asp Phe Trp Glu Val Gln Leu Gly 
180 185 190 

ata cca cat ccc gca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624 
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu 
195 200 205 

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gac ttc agg 672 
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg 
210 215 220 

aag tat act gca ttt acc ata cct agt gta aac aat gag aca cca ggg 720 
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Val Asn Asn Glu Thr Pro Gly 
225 230 235 240 

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768 
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa agt agc atg aca aag atc tta gar cct ttt aga aaa 816 
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys 
260 265 270 

caa aat cca gaa ata gtt atc tat caa tac atg gat gat ttg tat gta 864 
Gln Asn Pro Glu Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val 
275 280 285 

gga tcw gac tta gaa ata ggg caa cat aga ata aaa ata gag gaa ctg 912 
Gly Xaa Asp Leu Glu Ile Gly Gln His Arg Ile Lys Ile Glu Glu Leu 
290 295 300 

aga cag cat ctg tta agg tgg ggg ttt acc aca cca gac aaa aaa cat 960 
Arg Gln His Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His 
305 310 315 320 

cag aaa gaa cct cca ttc ctt tgg atg ggk tat gaa ctc cat cct gat 1008 
Gln Lys Glu Pro Pro Phe Leu Trp Met Xaa Tyr Glu Leu His Pro Asp 
325 330 335 

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gay agc tgg act 1056 
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr 
340 345 350 

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104 
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln 
355 360 365 

atc tac cca ggg 1116 
Ile Tyr Pro Gly 
370 



<210> 10 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV) 

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 
<400> 10 

cct cag atc act ctt tgg caa cga ccc ctc gtc aca gta aag ata ggg 48 
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Val Lys Ile Gly 
1 5 10 15 

ggg caa ata aag gaa gct yta tta gat aca gga gca gat gat aca gta 96 
Gly Gln Ile Lys Glu Ala Xaa Leu Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa atw ata ggg 144 
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Xaa Ile Gly 
35 40 45 

gga att gga ggt ttt atc aaa gta aga cag tat gat cag gta ccc ata 192 
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Val Pro Ile 
50 55 60 

gaa atc tgt gga caa aaa gct ata agt aca gta tta gta gga cct aca 240 
Glu Ile Cys Gly Gln Lys Ala Ile Ser Thr Val Leu Val Gly Pro Thr 
65 70 75 80 

cct gtc aat ata att gga aga aat ctg atg act cag att ggt tgc act 288 
Pro Val Asn Ile Ile Gly Arg Asn Leu Met Thr Gln Ile Gly Cys Thr 
85 90 95 

tta aat ttt cct att agt cct att gaa act gta cca gta aaa taa aag 336 
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys * Lys 
100 105 110 

cca gga atg gat ggc cca aga gtt aaa caa tgg cca ttg aca gaa gaa 384 
Pro Gly Met Asp Gly Pro Arg Val Lys Gln Trp Pro Leu Thr Glu Glu 
115 120 125 

aaa ata aaa gca tta gta gaa atc tgt aca gaa atg gaa aag gaa ggg 432 
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly 
130 135 140 

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480 
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 

gcc ata aag aaa aaa ggc agt aac aga tgg aga aaa tta gta gat ttc 528 
Ala Ile Lys Lys Lys Gly Ser Asn Arg Trp Arg Lys Leu Val Asp Phe 
160 165 170 175 

aga gaa ctt aat aag aaa act caa gac ttc tgg gaa gtt caa tta gga 576 
Arg Glu Leu Asn Lys Lys Thr Gln Asp Phe Trp Glu Val Gln Leu Gly 
180 185 190 

ata cca cat ccc gca ggg cta aaa aag aaa aaa tca gta aca gta ctg 624 
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu 
195 200 205 

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gaa ttc agg 672 
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Glu Phe Arg 
210 215 220 

aag tat act gca ttt acc ata cct agt aca aac aat gag aca cca ggg 720 
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly 
225 230 235 

att aga tat cag tac aat gtg ctt ccm caa gga tgg aaa ggg tca cca 768 
Ile Arg Tyr Gln Tyr Asn Val Leu Xaa Gln Gly Trp Lys Gly Ser Pro 
240 245 250 255 

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816 
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys 
260 265 270 

caa aat cca gac wtr gtt atc tat caa tac atg gat gat ttg tat gta 864 
Gln Asn Pro Asp Xaa Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val 
275 280 285 

agc tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa cta 912 
Ser Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu 
290 295 300 

aga caa cat ctg ttg agg tgg gga tta acc aca cca gac aaa aaa cat 960 
Arg Gln His Leu Leu Arg Trp Gly Leu Thr Thr Pro Asp Lys Lys His 
305 310 315 

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008 
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp 
320 325 330 335 

aaa tgg aca gta cag cct ata gtg ctg cca gag aaa gac agc tgg act 1056 
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr 
340 345 350 

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt caa 1104 
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln 
355 360 365 

att tac cca ggg 1116 
lie Tyr Pro Gly 
370 
<400> 11

cct cag atc act ctt tgg caa cga ccc aty gtt 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Xaa Val
1 5 10 15



ggg caa eta aaa raa get eta tta gat aca gga gca gat gat aca gta 96 
Gly Gin Leu Lys Xaa Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa atg ata gtg 144 
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met lie Val 
35 40 45 

gga att gga ggt ttt gtc aaa gta aga cag tat gat cag gta ccc ata 192 
Gly lie Gly Gly Phe Val Lys Val Arg Gin Tyr Asp Gin Val Pro lie 
50 55 60 

gag ate tgt ggg cat aaa att ata ggt aca gta tta ata gga cct acc 240 
Glu lie Cys Gly His Lys lie lie Gly Thr Val Leu lie Gly Pro Thr 
65 70 75 80 

cct gcc aac gta att gga aga aat ctg atg act cag ctt ggt tgc act 288
Pro Ala Asn Val Ile Gly Arg Asn Leu Met Thr Gln Leu Gly Cys Thr
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Val Asn Asp lie Gin Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gin 
355 360 365 

att tac cca ggg 1116 
He Tyr Pro Gly 
370 



<210> 12 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116)

<223> Portion of HIV Reverse Transcriptase 
<400> 12 

cct caa atc act ctt tgg car cga ccc tta gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa eta aag gaa gec eta tta gat aca gga gca gat gat aca gta 96 
Gly Gin Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 

eta gaa gaa atg aat ttg cca gga aaa tgg aaa cca aaa atg ata ggg 144 
Leu Glu Glu Met Asn Leu Pro Gly Lys Trp Lys Pro Lys Met He Gly 
35 40 45 

gga att gga ggt ttt ate aaa gta agg cag tat gat car ata ccc ata 192 
Gly He Gly Gly Phe He Lys Val Arg Gin Tyr Asp Gin He Pro He 
50 55 60 

gag atc tgc ggg tat aaa gct gtg ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly Tyr Lys Ala Val Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80 

cct gtc aac ata att gga aga aat ctg ttg act caa att ggt tgc act 288 
Pro Val Asn He He Gly Arg Asn Leu Leu Thr Gin He Gly Cys Thr 
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336 
Leu Asn Phe Pro He Ser Pro He Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125 

aaa ata aaa gca tta ata gaa att tgt aca gaa atg gaa aag gaa gga 432 
Lys He Lys Ala Leu He Glu He Cys Thr Glu Met Glu Lys Glu Gly 
130 135 140 

aaa att tea aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480 
Lys He Ser Lys He Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gcc ata aag aaa aaa gac ggt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Gly Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190 

ata cca cat ccc gca ggg tta aaa aag aaa aaa tea gta aca gta eta 624 
lie Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu 
195 200 205 

gat gtg ggt gat gca tat ttt tea gtt ccc tta gat caa gac ttc aga 672 
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Gin Asp Phe Arg 
210 215 220 

aag tat act gca ttc act ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240 

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tea cca 768 
lie Arg Tyr Gin Tyr Asn Val Leu Pro Gin Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa agt age atg aca aaa ate tta gag cct ttt aga aaa 816 
Ala lie Phe Gin Ser Ser Met Thr Lys lie Leu Glu Pro Phe Arg Lys 
260 265 270 

caa aat cca gac atg gtt ate tat caa tat atg gat gat ttg tat gta 864 
Gin Asn Pro Asp Met Val lie Tyr Gin Tyr Met Asp Asp Leu Tyr Val 
275 280 285 

ggc tct gac tta gaa aya ggg cag cat aga rca aaa ata gag gaa ctg 912 
Gly Ser Asp Leu Glu Xaa Gly Gin His Arg Xaa Lys lie Glu Glu Leu 
290 295 300 

aga caa cat ctg ttg agg tgg gga ttt acc aca cca gac aaa aaa cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttt ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335 

aaa tgg aca gta cag cct ata atg ctg cca gaa aaa gac age tgg act 1056 
Lys Trp Thr Val Gin Pro lie Met Leu Pro Glu Lys Asp Ser Trp Thr 
340 345 350 

gtc aat gac ata cag aag eta gtg gga aaa ttg aat tgg gca agt cag 1104 
Val Asn Asp lie Gin Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gin 
355 360 365 

att tat gca ggg 1116 
lie Tyr Ala Gly 
370 



<210> 13 

<211> 1116 

<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 
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<400> 13 

cct cag atc act ctt tgg caa cga ccc aty gtc aac ata aag gta ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Xaa Val Asn Ile Lys Val Gly
1 5 10 15

ggg caa eta arg gaa get eta ata gat aca gga gca gat gat aca gta 96 
Gly Gin Leu Xaa Glu Ala Leu lie Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 

tta gaa gac ata gat ttg cca gga aga tgg aga cca aga atg ata ggg 144 
Leu Glu Asp lie Asp Leu Pro Gly Arg Trp Arg Pro Arg Met lie Gly 
35 40 45 

gga att gga ggt ttt gtc aaa gta aag cag tat gat cag ata ccc ata 192 
Gly lie Gly Gly Phe Val Lys Val Lys Gin Tyr Asp Gin lie Pro lie 
50 55 60 

gaa ata tgt gga cat aaa gtt ata ggt aca gta tta gta gga cct acg 240 
Glu lie Cys Gly His Lys Val lie Gly Thr Val Leu Val Gly Pro Thr 
65 70 75 80 

cct gcc aac ata att gga aga aat ctg ttg act cag att ggg tgc act 288
Pro Ala Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aaa 336 
Leu Asn Phe Pro lie Ser Pro lie Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125 

aag ata aaa gca tta gta gaa att tgt aca gaa ttg gaa aag gaa gga 432 
Lys lie Lys Ala Leu Val Glu lie Cys Thr Glu Leu Glu Lys Glu Gly 
130 135 140 

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aag aaa aac agt act aga tgg aga aaa tta gta gat ttt 528
Ala Ile Lys Lys Lys Asn Ser Thr Arg Trp Arg Lys Leu Val Asp Phe
165 170 175 

aga gaa ctt aat aag aga act caa gac ttt tgt gaa gtg caa tta gga 576 
Arg Glu Leu Asn Lys Arg Thr Gin Asp Phe Cys Glu Val Gin Leu Gly 
180 185 190 

ata ccg cat ccc gca ggg tta ara aag aaa aga tea gta aca gta ctg 624 
lie Pro His Pro Ala Gly Leu Xaa Lys Lys Arg Ser Val Thr Val Leu 
195 200 205 

gat gtg ggt gat gca tat ttt tea gtt ccc tta gat gaa gac ttc agg 672 
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg 
210 215 220 

aag tat act gcc ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240 

att aga tat cag tat aat gtg ctt cca cag gga tgg aaa gga tea cca 768 
lie Arg Tyr Gin Tyr Asn Val Leu Pro Gin Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa agt age atg aca aaa ate eta gag cct ttt aga aaa 816 
Ala lie Phe Gin Ser Ser Met Thr Lys lie Leu Glu Pro Phe Arg Lys 
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260 265 270 

caa aat cca grc ata gtt ate gtt caa tac gtg gat gat ttg tat gta 864 
Gin Asn Pro Xaa He Val He Val Gin Tyr Val Asp Asp Leu Tyr Val 
275 280 285 

ggg tct gac tta gaa ata ggg caa cat aga gca aaa ata gag gag ttg 912 
Gly Ser Asp Leu Glu He Gly Gin His Arg Ala Lys He Glu Glu Leu 
290 295 300 

aga gaa cat ctg ttg agg tgg gga tty ttc aca cca gac gaa aaa cat 960 
Arg Glu His Leu Leu Arg Trp Gly Phe Phe Thr Pro Asp Glu Lys His 
305 310 315 320 

cag aaa gaa cct cca ttt ctt tgg atg ggt tat gaa ctc cac cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335 

aaa tgg acc gta cag cct ata aat ttg cca gaa aaa gac age tgg act 1056 
Lys Trp Thr Val Gin Pro He Asn Leu Pro Glu Lys Asp Ser Trp Thr 
340 345 350 

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104 
Val Asn Asp He Gin Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gin 
355 360 365

att tac tea ggg 1116 
He Tyr Ser Gly 
370 



<210> 14 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 
<400> 14 

cct caa atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa gta agg gaa get eta tta gat aca gga gca gat gat aca gta 96 
Gly Gin Val Arg Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 

tta gaa gaa atg aat ttg cca gga aaa tgg aag cca aaa atg ata ggg 144 
Leu Glu Glu Met Asn Leu Pro Gly Lys Trp Lys Pro Lys Met He Gly 
35 40 45 

gga att ggg ggc ttt ate aaa gta aga cag tat gat caa ata ccc ata 192 
Gly He Gly Gly Phe He Lys Val Arg Gin Tyr Asp Gin He Pro He 
50 55 60 

gaa atc tgt gga cat aaa gct ata ggg aca gtg tta ata gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Ile Gly Pro Thr
65 70 75 80 

cct gtc aac ata att gga aga aat ctg ttg act cag ctt ggt tgc act 288
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gtc aat gac ata caa aag tta gtg gga aaa tta aat tgg gca agt cag 1104 
Val Asn Asp lie Gin Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gin 
355 360 365 

att tat gca ggg 1116 
lie Tyr Ala Gly 
370 

<210> 15 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 
<400> 15 

cct caa atc act ctt tgg car cga ccc ctc gtt gca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Ala Ile Lys Ile Gly
1 5 10 15

ggg caa eta aag gaa get eta tta gat aca gga gca gat gat aca gta 96 
Gly Gin Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 

tta kaa gaa atg gat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144 
Leu Xaa Glu Met Asp Leu Pro Gly Arg Trp Lys Pro Lys Met lie Gly 
35 40 45 

gga att gga ggt ttt ate aaa gta aga cag tat gat cag gta tec wta 192 
Gly lie Gly Gly Phe lie Lys Val Arg Gin Tyr Asp Gin Val Ser Xaa 
50 55 60 

gaa ate tgt gga cat aaa get ata ggt aca gta tta ata gga cct aca 240 
Glu lie Cys Gly His Lys Ala lie Gly Thr Val Leu lie Gly Pro Thr 
65 70 75 80 

cct gtc aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336 
Leu Asn Phe Pro lie Ser Pro lie Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125 

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa gga 432 
Lys lie Lys Ala Leu Val Glu lie Cys Thr Glu Met Glu Lys Glu Gly 
130 135 140 

aaa att tea aaa att ggg cct gaa aat cca tat aat act cca gta ttt 480 
Lys lie Ser Lys lie Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa ttg gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175 






Pro Ala Asn lie lie Gly Arg Asn Leu Leu Thr Gin lie Gly Cys Thr 
85 90 95 

eta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336 
Leu Asn Phe Pro lie Ser Pro lie Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca aaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Lys Glu
115 120 125 

aaa ata gaa gca tta gta gaa ate tgt gca gaa ctg gaa gag gca ggg 432 
Lys lie Glu Ala Leu Val Glu lie Cys Ala Glu Leu Glu Glu Ala Gly 
130 135 140 



aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160



gcc ata aag aar aag aac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aac aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttc tca att ccc tta gat aag gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Ile Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tat act gca ttt aca ata cct agy ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Xaa Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cma cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Xaa Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc cag tgt agc atg aca aaa atc tta gat cct ttt aga aaa 816
Ala Ile Phe Gln Cys Ser Met Thr Lys Ile Leu Asp Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtt atc tat caa tac gtg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg car cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa yat ctg tgg aag tgg gga ttt tac aca cca gag aat aaa cat 960
Arg Gln Xaa Leu Trp Lys Trp Gly Phe Tyr Thr Pro Glu Asn Lys His
305 310 315 320

cag aaa gaa cct cca ttc cwt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Xaa Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aag gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350
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gtc aat gac ata cag aaa tta gtg gga aaa ttg aat tgg gca agt cag 1104 
Val Asn Asp lie Gin Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gin 
355 360 365 

att tat gcn ggg 1116
Ile Tyr Ala Gly
370 



<210> 18 
<211> 1117 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1117) 

<223> Portion of HIV Reverse Transcriptase 
<400> 18 

cct caa atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg car cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30 

gta gaa gaa atg aat tta tea gga agg tgg aaa cca aaa atg ata ggg 144 
Val Glu Glu Met Asn Leu Ser Gly Arg Trp Lys Pro Lys Met lie Gly 
35 40 45 

gga att gga ggt ttt ate aaa gta aga saa tat gaa cag ata cct gta 192 
Gly lie Gly Gly Phe lie Lys Val Arg Xaa Tyr Glu Gin lie Pro Val 
50 55 60 

gaa att tgt gga cat aaa get gta ggt aca gta tta gtg gga cct aca 240 
Glu lie Cys Gly His Lys Ala Val Gly Thr Val Leu Val Gly Pro Thr 
65 70 75 80 

cct gtc aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95 

tta aat ttt ccc att agt ccc att gaa act gta cca gta aaa ttg aag 336 
Leu Asn Phe Pro lie Ser Pro lie Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggc ccg aga gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Arg Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125 

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa ggg 432 
Lys lie Lys Ala Leu Val Glu lie Cys Thr Glu Met Glu Lys Glu Gly 
130 135 140 

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt aat aaa tgg agg aaa tta gtg gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Asn Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta ggg 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccy tca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Xaa Ser Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tac ttt tca gtt ccc tta gat aaa gaa ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Glu Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt aca aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240

att agr tat cag tac aat gtg ctg cca cag gga tgg aaa gga tca cca 768
Ile Xaa Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga gaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Glu
260 265 270

caa aat aca gac ata gtt atc tgt caa tac atg gat gat ttg tat gta 864
Gln Asn Thr Asp Ile Val Ile Cys Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga gca aaa gtr gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Ala Lys Xaa Glu Glu Leu
290 295 300

aga caa cat ctg ttg agg tgg gga yta acc aca cca gac aaa aaa cat 960
Arg Gln His Leu Leu Arg Trp Gly Xaa Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc cgt tgg atg ggk tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Arg Trp Met Xaa Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gtr caa cct ata gag ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Xaa Gln Pro Ile Glu Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata caa aaa gtt agt ggg aaa att aaa ttg ggc aag tca 1104
Val Asn Asp Ile Gln Lys Val Ser Gly Lys Ile Lys Leu Gly Lys Ser
355 360 365

gat tta ccc agg g
Asp Leu Pro Arg
370



<210> 19 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297)

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 
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<400> 19 

cct cag atc act ctt tgg caa cga ccc cty gtc aca gta aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Xaa Val Thr Val Lys Ile Gly
1 5 10 15

ggg caa cta acg gaa gct yta ttg gat aca gga gca gat aat aca gta 96
Gly Gln Leu Thr Glu Ala Xaa Leu Asp Thr Gly Ala Asp Asn Thr Val
20 25 30

tta gaa gaa atg agt ttr cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Ser Xaa Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45



gga att gga ggt ttt ate aaa gta aga cag tat gat cag ata ccc ata 192 
Gly lie Gly Gly Phe lie Lys Val Arg Gin Tyr Asp Gin lie Pro lie 
50 55 60 

gaa atc tgt gga cat aaa gta gta ggt aca gta tta ata gga cct aca 240
Glu Ile Cys Gly His Lys Val Val Gly Thr Val Leu Ile Gly Pro Thr
65 70 75 80 

cct gtc aac ata att gga aga gat ctg ttg act cag att ggt tgc act 288 
Pro Val Asn lie lie Gly Arg Asp Leu Leu Thr Gin lie Gly Cys Thr 
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cca gtg aaa tta aag 336 
Leu Asn Phe Pro lie Ser Pro lie Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa ctg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Leu Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aar gac agt act aaa tgg aga aaa ttr gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Xaa Val Asp Phe
165 170 175 

aga gaa ctt aat aaa aga act caa gac ttc tgg gaa gtt caa tta gga 576 
Arg Glu Leu Asn Lys Arg Thr Gin Asp Phe Trp Glu Val Gin Leu Gly 
180 185 190 

ata cca cat ccc tea ggg tta aaa aag aaa aaa tea gta aca gta eta 624 
lie Pro His Pro Ser Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu 
195 200 205 

gac gtg ggt gat gca tat ttc tea gtt ccc eta gat aaa gaa ttc agg 672 
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Glu Phe Arg 
210 215 220 

aag tat act gca ttc acc ata cct agt gta aac aat gag act cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Val Asn Asn Glu Thr Pro Gly
225 230 235 240 

att aga tat cag tac aat gtg ctg cca cag gga tgg aaa gga tea cca 768 
lie Arg Tyr Gin Tyr Asn Val Leu Pro Gin Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa agt age atg aca aaa ate tta gag cct ttt aga aaa 816 
Ala lie Phe Gin Ser Ser Met Thr Lys lie Leu Glu Pro Phe Arg Lys 
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260 265 270 

cac aat cca aac ata gtt ate tat caa tac gtg gat gat tta tat gta 864 
His Asn Pro Asn lie Val lie Tyr Gin Tyr Val Asp Asp Leu Tyr Val 
275 280 285 

gga tct gac tta gaa ata ggg cag cat aga aca aaa gta gag gaa ctg 912 
Gly Ser Asp Leu Glu lie Gly Gin His Arg Thr Lys Val Glu Glu Leu 
290 295 300 

aga caa cat ctg ttg aag tgg ggg ttt tac aca cca gac aaa aaa cat 960 
Arg Gin His Leu Leu Lys Trp Gly Phe Tyr Thr Pro Asp Lys Lys His 
305 310 315 320 

cag aaa gaa ccc cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gtg cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350 

gtc aat gac ata caa aag tta gtg gga aaa ttg aat tgg gca agt cag 1104 
Val Asn Asp lie Gin Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gin 
355 360 365 

att tac cca ggg 1116 
lie Tyr Pro Gly 
370 

<210> 20 
<211> 1117 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1117) 

<223> Portion of HIV Reverse Transcriptase 
<400> 20 

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata gga 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg cag eta aag gaa get eta tta gat aca gga gca gat gat aca gta 96 
Gly Gin Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 

tta gaa gac ata aat ttg cca ggg aaa tgg aaa cca aaa atg ata ggg 144 
Leu Glu Asp lie Asn Leu Pro Gly Lys Trp Lys Pro Lys Met lie Gly 
35 40 45 

gga att gga ggt ttt ate aaa gta aga cag tat gat caa ata cca gta 192 
Gly He Gly Gly Phe He Lys Val Arg Gin Tyr Asp Gin He Pro Val 
50 55 60 

gaa att tgt gga cat aaa gct gta ggt aca gta tta ata gga cct aca 240
Glu Ile Cys Gly His Lys Ala Val Gly Thr Val Leu Ile Gly Pro Thr
65 70 75 80 

cct gtc aac gta att gga aga aat ctg atg act cag att ggc tgc act 288
Pro Val Asn Val Ile Gly Arg Asn Leu Met Thr Gln Ile Gly Cys Thr
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggt cca aaa gtt aaa caa tgg cca tta aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125 

aaa ata aaa gca tta gta gaa att tgc aca gaa ttg gaa aag gaa ggg 432 
Lys lie Lys Ala Leu Val Glu lie Cys Thr Glu Leu Glu Lys Glu Gly 
130 135 140 

aaa att tea aaa att gga cct gaa aat cca tac aat act cca gta ttt 480 
Lys lie Ser Lys lie Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190 

ata cca cat cct gca ggg tta aaa aag aaa aaa tea gta aca gta ctg 624 
lie Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu 
195 200 205 

gat gtg ggt gat gca tat ttt tea ata ccc tta gat gaa gaa ttc agg 672 
Asp Val Gly Asp Ala Tyr Phe Ser lie Pro Leu Asp Glu Glu Phe Arg 
210 215 220 

aag tat act gca ttt acc ata cct agt cca aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Pro Asn Asn Glu Thr Pro Gly
225 230 235 240 

att aga tat caa tac aat gtg ctt cca cag gga tgg aaa gga tea cca 768 
lie Arg Tyr Gin Tyr Asn Val Leu Pro Gin Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttt caa tgt agt atg aca aaa ate tta gag cct ttt aga aaa 816 
Ala lie Phe Gin Cys Ser Met Thr Lys lie Leu Glu Pro Phe Arg Lys 
260 265 270 

gaa aat cca gat ata gtt ate tac caa tac atg gat gac tta tat gta 864 
Glu Asn Pro Asp lie Val lie Tyr Gin Tyr Met Asp Asp Leu Tyr Val 
275 280 285 

gga tct gat tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912 
Gly Ser Asp Leu Glu lie Gly Gin His Arg Thr Lys lie Glu Glu Leu 
290 295 300 

aga caa tat ctg tgg aag tgg gga ttt tac aca cca gac aaa aaa cat 960
Arg Gln Tyr Leu Trp Lys Trp Gly Phe Tyr Thr Pro Asp Lys Lys His
305 310 315 320

cag caa gaa cct cca ttc cgt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Gln Glu Pro Pro Phe Arg Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350 
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gtc aat gac ata cag aag ttt agt ggg aaa att gaa ttg ggc aag tea 1104 
Val Asn Asp He Gin Lys Phe Ser Gly Lys He Glu Leu Gly Lys Ser 
355 360 365 

gat tta tgc agg g 1117 
Asp Leu Cys Arg 
370 



<210> 21 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 



<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 
<400> 21 

cct cag atc act ctt tgg caa cga trice gtt gtc wca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Xaa Val Val Xaa Ile Lys Ile Gly
1 5 10 15

ggg caa eta aaa gaa get eta tta gay aca ggg gca gat gat aca gta 96 
Gly Gin Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 

tta gaa gac atg cat ttg cca ggt aga tgg aaa cca aaa atg ata gtg 144 
Leu Glu Asp Met His Leu Pro Gly Arg Trp Lys Pro Lys Met He Val 
35 40 45 

gga att ggg ggt ttt gtc aaa gta aga cag tat gat cag ata cct gta 192 
Gly He Gly Gly Phe Val Lys Val Arg Gin Tyr Asp Gin He Pro Val 
50 55 60 

gaa ate tgt gga cat aaa get ata ggt aca gta tta gta gga cct aca 240 
Glu He Cys Gly His Lys Ala He Gly Thr Val Leu Val Gly Pro Thr 
65 70 75 80 

cca gee aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288 
Pro Ala Asn He He Gly Arg Asn Leu Leu Thr Gin He Gly Cys Thr 
85 90 95 

tta aat ttc ccc ate agt cct att gaa act gta cca gta aaa tta aag 336 
Leu Asn Phe Pro He Ser Pro He Glu Thr Val Pro Val Lys Leu Lys 
100 105 110

cca gga atg gat ggc cca aaa att aga caa tgg cca tta aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Ile Arg Gln Trp Pro Leu Thr Glu Glu
115 120 125 

aaa ata aaa gca tta gta gaa ate tgt aca gaa atg gaa aag gaa ggg 432 
Lys He Lys Ala Leu Val Glu He Cys Thr Glu Met Glu Lys Glu Gly 
130 135 140 

aaa att tea aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480 
Lys He Ser Lys He Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gcc ata aag aaa aaa aat agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175 
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aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576 
Arg Glu Leu Asn Lys Arg Thr Gin Asp Phe Trp Glu Val Gin Leu Gly 
180 185 190 

ata cca cat ccc gca ggg tta aaa aag aaa aaa tea gta aca gta eta 624 
lie Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu 
195 200 205 

gat gtg ggt gat gca tat ttt tea gtt ccc tta gat aaa gac ttc agg 672 
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg 
210 215 220 

aag tat act gca ttt acc ata cct agt atg aac aat gag aca cca gga 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Met Asn Asn Glu Thr Pro Gly
225 230 235 240 

att aga tat cag tac aat gtg ctt cca atg gga tgg aaa gga tea cca 768 
lie Arg Tyr Gin Tyr Asn Val Leu Pro Met Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa agt agt atg aca aaa ate tta gag cct ttt aga aaa 816 
Ala lie Phe Gin Ser Ser Met Thr Lys lie Leu Glu Pro Phe Arg Lys 
260 265 270 

cag aat cca gac ata gtc ate tat caa tac atg gat gat tta tat gta 864 
Gin Asn Pro Asp lie Val lie Tyr Gin Tyr Met Asp Asp Leu Tyr Val 
275 280 285 

gga teg gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ttg 912 
Gly Ser Asp Leu Glu lie Gly Gin His Arg Thr Lys lie Glu Glu Leu 
290 295 300 

aga caa cat ctg ttg aga tgg gga ttt acc aca cca gac aaa aaa cat 960 
Arg Gin His Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His 
305 310 315 320 

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gtg cag cct ata gtg ctg cca gaa aag gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350 

gtt aat gac ata cag aag tta gtg gga aaa tta aat tgg gca agt caa 1104 
Val Asn Asp lie Gin Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gin 
355 360 365 

att tat gca ggg 1116 
lie Tyr Ala Gly 
370 

<210> 22 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 






<400> 22 

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata aag gta gga 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Val Gly
1 5 10 15

ggg caa cta aag gag gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gac ata gat ttg cca gga agr tgg aaa cca aaa atg ata ggg 144
Leu Glu Asp Ile Asp Leu Pro Gly Xaa Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag ata ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa ata tgt gga cat aaa gct ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cgg att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Arg Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttt 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gat ttc tgg gaa gtg caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata ccg cat ccc gca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aar gay ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tat act gcc ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tat aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc cta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac atg gtt atc tat caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Met Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

ggg tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga gaa cat ctg ttg agg tgg gga ttt acc acc cca gac aaa aaa cat 960
Arg Glu His Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gag cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg acc gtr cag cct ata gag ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Xaa Gln Pro Ile Glu Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tac cca ggg 1116
Ile Tyr Pro Gly
370



<210> 23
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase
<400> 23 

cct cag atc act ctt tgg caa cga ccc ata gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta ata gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Ile Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gac ata aat ttg cca gga aga tgg aaa cca aaa tta ata ggg 144
Leu Glu Asp Ile Asn Leu Pro Gly Arg Trp Lys Pro Lys Leu Ile Gly
35 40 45

gga att gga ggt ttt gtc aga gtg aaa cag tat gat cag ata ccc ata 192
Gly Ile Gly Gly Phe Val Arg Val Lys Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa att tgt gga cat aaa gtt ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Val Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gcc aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288
Pro Ala Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aga gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Arg Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta aca gaa atc tgt wca gag atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Thr Glu Ile Cys Xaa Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aac act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcy ata cac aag aaa aat agt aat aga tgg aga aaa gta gta gat ttc 528
Xaa Ile His Lys Lys Asn Ser Asn Arg Trp Arg Lys Val Val Asp Phe
165 170 175

agg gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca gga tta aaa aag aac aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Asn Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aag gat ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tat act gcg ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

atc aga tac cag tac aat gtg ctt cca caa gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aga atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Arg Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gaa ata gtt atc tgt caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Glu Ile Val Ile Cys Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga aca aaa ata aak gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Xaa Glu Leu
290 295 300

aga saa cat ctg ttg agg tgg gga ttt ttc aca cca gac caa aaa cat 960
Arg Xaa His Leu Leu Arg Trp Gly Phe Phe Thr Pro Asp Gln Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aar gac agt tgg acw 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Xaa
340 345 350

gty aat gac ata cag aaa tta gtk gga aaa ttg aat tgg gca agt caa 1104
Xaa Asn Asp Ile Gln Lys Leu Xaa Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tac cca ggg 1116
Ile Tyr Pro Gly
370



<210> 24
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase
<400> 24 

cct cag atc act ctt tgg caa cga ccc ata gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta ata gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Ile Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gac ata aat ttg cca gga aga tgg aaa cca aaa tta ata ggg 144
Leu Glu Asp Ile Asn Leu Pro Gly Arg Trp Lys Pro Lys Leu Ile Gly
35 40 45

gga att gga ggt ttt gtc aga gtg aaa cag tat gat cag ata ccc ata 192
Gly Ile Gly Gly Phe Val Arg Val Lys Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa att tgt gga cat aaa gtt ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Val Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gcc aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288
Pro Ala Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aga gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Arg Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta aca gaa atc tgt wca gag atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Thr Glu Ile Cys Xaa Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aac act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcy ata cac aag aaa aat agt aat aga tgg aga aaa gta gta gat ttc 528
Xaa Ile His Lys Lys Asn Ser Asn Arg Trp Arg Lys Val Val Asp Phe
165 170 175

agg gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca gga tta aaa aag aac aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Asn Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aag gat ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tat act gcg ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

atc aga tac cag tac aat gtg ctt cca caa gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aga atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Arg Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gaa ata gtt atc tgt caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Glu Ile Val Ile Cys Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga aca aaa ata aak gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Xaa Glu Leu
290 295 300

aga saa cat ctg ttg agg tgg gga ttt ttc aca cca gac caa aaa cat 960
Arg Xaa His Leu Leu Arg Trp Gly Phe Phe Thr Pro Asp Gln Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aar gac agt tgg acw 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Xaa
340 345 350

gty aat gac ata cag aaa tta gtk gga aaa ttg aat tgg gca agt caa 1104
Xaa Asn Asp Ile Gln Lys Leu Xaa Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tac cca ggg 1116
Ile Tyr Pro Gly
370

<210> 25
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase




<400> 25 

cct caa atc act ctt tgg caa cga ccc ctc gtc aca ata aaa ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta cta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg agt ttg cca gga aaa tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Ser Leu Pro Gly Lys Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag gta tcc atg 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Val Ser Met
50 55 60

gaa atc tgt gga cat aaa gtt ata ggt aca gta tta gta gga tct aca 240
Glu Ile Cys Gly His Lys Val Ile Gly Thr Val Leu Val Gly Ser Thr
65 70 75 80

cct gtc aac ata att gga aga aat ytg ttg act cag ctt ggg tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Xaa Leu Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta ata gaa att tgt aca gaa atg gaa aag gar ggg 432
Lys Ile Lys Ala Leu Ile Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aag tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aaa act caa gat ttc tgg gaa rtt caa tta gga 576
Arg Glu Leu Asn Lys Lys Thr Gln Asp Phe Trp Glu Xaa Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta caa aag aac aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Gln Lys Asn Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtc ccc tta gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt aca aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa tat agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Tyr Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtt atc tac caa tac gtg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata gaa cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Glu Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga cag cat ctg ttg agg tgg gga ttt acc aca cca gac aaa aaa cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctc tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gtt cag cct ata gtg ctg cca gaa aag gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa tta aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tac cca ggg 1116
Ile Tyr Pro Gly
370

<210> 26
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase
<400> 26 

cct cag atc act ctt tgg caa cga ccc atc gtc gaa ata aag gta ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Glu Ile Lys Val Gly
1 5 10 15

ggg caa cta ata gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Ile Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa ata aat tta cca gga aga tgg aaa cca aga atg ata ggg 144
Leu Glu Glu Ile Asn Leu Pro Gly Arg Trp Lys Pro Arg Met Ile Gly
35 40 45

gga att gga ggt ttt gtc aaa gta aga cag tat gat cag gta cct atc 192
Gly Ile Gly Gly Phe Val Lys Val Arg Gln Tyr Asp Gln Val Pro Ile
50 55 60

gaa atc tgt gga cat aaa gtt ata agt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Val Ile Ser Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gcc aac ata att gga aga aat ctg atg act cag att ggt tgc act 288
Pro Ala Asn Ile Ile Gly Arg Asn Leu Met Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt cct att agt cct att gaa act gta cca gta aaa tta aaa 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aga gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Arg Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa ytg gaa gag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Xaa Glu Glu Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca ata ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Ile Phe
145 150 155 160

gcc ata aag aag aaa nnn agt ggt aga tgg aga aaa ata gta gat ttt 528
Ala Ile Lys Lys Lys Xaa Ser Gly Arg Trp Arg Lys Ile Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gat ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aac aag tca gta aca att ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Asn Lys Ser Val Thr Ile Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aag gaa ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Glu Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt ata aat aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270



caa aat cca gac ata gtt atc tat cag tac gtg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gat tta gaa ata ggg gag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Glu His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga car cat ctg tta arg tgg gga ttt ttc aca cca gaa caa aaa cat 960
Arg Gln His Leu Leu Xaa Trp Gly Phe Phe Thr Pro Glu Gln Lys His
305 310 315 320

cag aaa gaa cct con ttc cak tgg atg ggt tat gaa ctc cay cct gat 1008
Gln Lys Glu Pro Xaa Phe Xaa Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cas cct ata gtg ctg cca gaa aaa gat agc tgg act 1056
Lys Trp Thr Val Xaa Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tac cca ggg 1116
Ile Tyr Pro Gly
370



<210> 27
<211> 1113
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase
<400> 27 

cct cag atc act ctt tgg caa cga ccc atc gtc gaa ata aag gta ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Glu Ile Lys Val Gly
1 5 10 15

ggg caa cta ata gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Ile Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa ata aat tta cca gga aga tgg aaa cca aga atg ata ggg 144
Leu Glu Glu Ile Asn Leu Pro Gly Arg Trp Lys Pro Arg Met Ile Gly
35 40 45

gga att gga ggt ttt gtc aaa gta aga cag tat gat cag gta cct atc 192
Gly Ile Gly Gly Phe Val Lys Val Arg Gln Tyr Asp Gln Val Pro Ile
50 55 60

gaa atc tgt gga cat aaa gtt ata agt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Val Ile Ser Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gcc aac ata att gga aga aat ctg atg act cag att ggt tgc act 288
Pro Ala Asn Ile Ile Gly Arg Asn Leu Met Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt cct att agt cct att gaa act gta cca gta aaa tta aaa 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aga gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Arg Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa ytg gaa gag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Xaa Glu Glu Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca ata ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Ile Phe
145 150 155 160

gcc ata aag aag aaa agt ggt aga tgg aga aaa ata gta gat ttt aga 528
Ala Ile Lys Lys Lys Ser Gly Arg Trp Arg Lys Ile Val Asp Phe Arg
165 170 175

gaa ctt aat aag aga act caa gat ttc tgg gaa gtt caa tta gga ata 576
Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly Ile
180 185 190

cca cat ccc gca ggg tta aaa aag aac aag tca gta aca att ctg gat 624
Pro His Pro Ala Gly Leu Lys Lys Asn Lys Ser Val Thr Ile Leu Asp
195 200 205

gtg ggt gat gca tat ttt tca gtt ccc tta gat aag gaa ttc agg aag 672
Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Glu Phe Arg Lys
210 215 220

tat act gca ttt acc ata cct agt ata aat aat gag aca cca ggg att 720
Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly Ile
225 230 235 240

aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca gca 768
Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro Ala
245 250 255

ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aaa caa 816
Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys Gln
260 265 270

aat cca gac ata gtt atc tat cag tac gtg gat gat ttg tat gta gga 864
Asn Pro Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val Gly
275 280 285

tct gat tta gaa ata ggg gag cat aga aca aaa ata gag gaa ctg aga 912
Ser Asp Leu Glu Ile Gly Glu His Arg Thr Lys Ile Glu Glu Leu Arg
290 295 300

car cat ctg tta arg tgg gga ttt ttc aca cca gaa caa aaa cat cag 960
Gln His Leu Leu Xaa Trp Gly Phe Phe Thr Pro Glu Gln Lys His Gln
305 310 315 320

aaa gaa cct ccm ttc cak tgg atg ggt tat gaa ctc cay cct gat aaa 1008
Lys Glu Pro Xaa Phe Xaa Trp Met Gly Tyr Glu Leu His Pro Asp Lys
325 330 335

tgg aca gta cas cct ata gtg ctg cca gaa aaa gat agc tgg act gtc 1056
Trp Thr Val Xaa Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr Val
340 345 350

aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag att 1104
Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln Ile
355 360 365

tac cca ggg 1113
Tyr Pro Gly
370

<210> 28
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase
<400> 28

cct caa atc act stt tgg caa cga ccc aty gtc tca ata aag ata ggg 48
Pro Gln Ile Thr Xaa Trp Gln Arg Pro Xaa Val Ser Ile Lys Ile Gly
1 5 10 15

ggg caa ata aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Ile Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga aga tgg aag cca aaa atg ata gtg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Val
35 40 45

gga att gga ggt ttt agc aaa gta aga caa tat gat cag ata ccc ata 192
Gly Ile Gly Gly Phe Ser Lys Val Arg Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa atc tgc gga cgt aaa gtt gta ggt tca gta tta ata gga cct aca 240
Glu Ile Cys Gly Arg Lys Val Val Gly Ser Val Leu Ile Gly Pro Thr
65 70 75 80

cct gcc aac ata att gga aga aat ctg ttg act cag ctt ggc tgt act 288
Pro Ala Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct atk gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Xaa Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca aaa gag 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Lys Glu
115 120 125

aaa ata aaa gca tta ata gaa att tgt aca gaa ttg gaa gaa gma gga 432
Lys Ile Lys Ala Leu Ile Glu Ile Cys Thr Glu Leu Glu Glu Xaa Gly
130 135 140

aaa att aca aaa att ggg cct gaa aat ccg tac aat act cca ata ttt 480
Lys Ile Thr Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Ile Phe
145 150 155 160

gcc ata aag aaa aar aac agt act aaa tgg aga aaa tta gta gac ttc 528
Ala Ile Lys Lys Lys Asn Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

agg gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat cct gca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca att ccc tta gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Ile Pro Leu Asp Lys Asp Phe Arg
210 215 220

aar tat act gca ttt acc ata cct agt acg aat aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tat aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtt atc tat caa tac gtg gat gat ttg ctt gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Leu Val
275 280 285

gga tct gac tta gaa ata gag cag cat aga aca aaa ata gag gag cta 912
Gly Ser Asp Leu Glu Ile Glu Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg tgg aag tgg gga ttt tac aca cca gac aaa aaa cat 960
Arg Gln His Leu Trp Lys Trp Gly Phe Tyr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa ccc cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gka cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Xaa Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata caa aag tta gtg gga aaa tta aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat cca ggg 1116
Ile Tyr Pro Gly
370

<210> 29
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase
<400> 29 

cct cag atc act ctt tgg caa cga ccc atc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta ata gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Ile Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa ata ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Ile Ile Gly
35 40 45

gga att gga ggt ctt gtc aaa gta aga cag tat gat cag ata ccc ata 192
Gly Ile Gly Gly Leu Val Lys Val Arg Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa atc tgt gga cat aaa gtt ata ggt aca gtw tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Val Ile Gly Thr Xaa Leu Val Gly Pro Thr
65 70 75 80

cct gcc aac ata att gga aga aat ctg ttg act cag ctt ggt tgc act 288
Pro Ala Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca ggg atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gag aag gag gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aag att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aag aac agt act agg tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Ser Thr Arg Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aac aaa tca gca aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Asn Lys Ser Ala Thr Val Leu
195 200 205

gat gtg ggc gat gca tat ttt tca gtt ccc tta gac aaa gaa ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Glu Phe Arg
210 215 220

aag tat act gca ttt acy ata cct agt ata aac aat gaa aca cca ggg 720
Lys Tyr Thr Ala Phe Xaa Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

tar ata tca gtg tac aat gtr ctt cca caa gga tgg aaa gga tca cma 768
Xaa Ile Ser Val Tyr Asn Xaa Leu Pro Gln Gly Trp Lys Gly Ser Xaa
245 250 255

gca ata ttc maa agt agc atg aca aga atc tta gag cct ttt aga aaa 816
Ala Ile Phe Xaa Ser Ser Met Thr Arg Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gaa ata gtt atc tat caa tac gtg gat gat ttg tat gta 864
Gln Asn Pro Glu Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga aca aaa gta gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Val Glu Glu Leu
290 295 300

aga caa cat ctg ttg agg tgg gga ttt ttc aca cca gac caa aaa cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Phe Thr Pro Asp Gln Lys His
305 310 315 320

cag aaa gaa ccc cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa tta aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tac gcn ggg 1116
Ile Tyr Ala Gly
370



<210> 30
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase
<400> 30 

cct caa atc act ctt tgg caa cga ccc cty gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Xaa Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct yta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Xaa Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg agc tta cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Ser Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggk ttt atc aaa gtg agm cag tat gat cag ata ctc ata 192
Gly Ile Gly Xaa Phe Ile Lys Val Xaa Gln Tyr Asp Gln Ile Leu Ile
50 55 60

gaa aty tgt gga cat aaa gct ata ggt aca gtr tta ata gga cct aca 240
Glu Xaa Cys Gly His Lys Ala Ile Gly Thr Xaa Leu Ile Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa ttg aaa 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtc aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa ggr 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Xaa
130 135 140

aaa att aca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Thr Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gec ata aag aag aaa aac agt gat aaa tgg aga aaa tta gta gat ttc 528 
Ala He Lys Lys Lys Asn Ser Asp Lys Trp Arg Lys Leu Val Asp Phe 
1^5 170 175 
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aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 57 6 

Arg Glu Leu Asn Lys Arg Thr Gin Asp Phe Trp Glu Val Gin Leu Gly 
180 185 190 

ata cca cat cca gca ggg tta aaa cag aaa aag tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Gln Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gta ccc tta gat gaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt gta aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Val Asn Asn Glu Thr Pro Gly
225 230 235 240

gtt aga tat cag tac aat gta ctc cca cag gga tgg aaa gga tca cca 768
Val Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttt caa agt agc atg aca aaa atc tta gag cct ttt agg aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gaa ata gtt atc tat caa tac gtg gat gat ttg tat gta 864
Gln Asn Pro Glu Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga gca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Ala Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg ttg agg tgg gga ttc tac aca cca gac aaa aaa cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Tyr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gta ggg aaa tta aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat gca gga 1116
Ile Tyr Ala Gly
370

<210> 31 
<211> 1117 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 
<400> 31

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa tta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

cta gaa gac gtg cat ttg cca gga aaa tgg aaa cca aaa atg ata ggg 144
Leu Glu Asp Val His Leu Pro Gly Lys Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat gag gta ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Glu Val Pro Ile
50 55 60

gaa ctc tgt gga cat aaa gct ata ggt aca gta tta gta gga cct aca 240
Glu Leu Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

ccc gtc aac ata att gga aga aat ctg wtg act caa ctt ggg tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Xaa Thr Gln Leu Gly Cys Thr
85 90 95

cta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aga gtt ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Arg Val Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gyc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Xaa Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cay ccc gca ggg tta aaa aag aaa aaa tca gta aca gta ctr 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Xaa
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gaa ttc aga 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Glu Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tac cag tac aat gtg ctt cca caa gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gat cct ttt agg aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Asp Pro Phe Arg Lys
260 265 270

caa aac cca gac ata gtt atc tat caa tac gtg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tcy gac tta gaa ata gga cag cat agr rca aaa ata gaa gaa ctg 912
Gly Xaa Asp Leu Glu Ile Gly Gln His Xaa Xaa Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg ttg aag tgg gga ttt acc aca cca gac aag aaa cat 960
Arg Gln His Leu Leu Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

car aaa gaa cct cca ttt ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gtg cag cct ata gtg ctg cca gaa aag gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ant aca gaa gtt agt ggg aaa att gaa ttg ggc aag tca 1104
Val Asn Asp Xaa Thr Glu Val Ser Gly Lys Ile Glu Leu Gly Lys Ser
355 360 365

gat tta tgc agg g 1117
Asp Leu Cys Arg
370



<210> 32 

<211> 1116 

<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 
<400> 32 

cct caa atc act ctt tgg caa cga ccc cty gtc gca ata agg ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Xaa Val Ala Ile Arg Ile Gly
1 5 10 15

ggg caa cta aag gaa gcc cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gac atg gag ttg cca gga aga tgg aag cca aaa atg ata ggg 144
Leu Glu Asp Met Glu Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aam cag tat gat cag ata ctt gta 192
Gly Ile Gly Gly Phe Ile Lys Val Xaa Gln Tyr Asp Gln Ile Leu Val
50 55 60

gaa atc tgt gga cat aaa gct gta ggt aca gta tta ata gga cct aca 240
Glu Ile Cys Gly His Lys Ala Val Gly Thr Val Leu Ile Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ttg ttg act cag att ggc tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gag 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa atc tgt aca gaa ttg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Leu Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gct ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aaa aga act caa gac ttt tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aaa aaa tcc gtg aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gac ttt aga 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tat act gca ttt acc aya cct sgt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Xaa Pro Xaa Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tcc cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttt caa agc agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac wta gtt wtc tat caa twc ata gat gat ctg tat gta 864
Gln Asn Pro Asp Xaa Val Xaa Tyr Gln Xaa Ile Asp Asp Leu Tyr Val
275 280 285

ggc tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga cag cat ctg tgg aag tgg ggg ttt tac aca cca gac aaa aaa cat 960
Arg Gln His Leu Trp Lys Trp Gly Phe Tyr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttt ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata atg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Met Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aar tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tac cca ggg 1116
Ile Tyr Pro Gly
370



<210> 33 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 
<400> 33 

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta kat aca gga gca gat gat aca gtm 96
Gly Gln Leu Lys Glu Ala Leu Leu Xaa Thr Gly Ala Asp Asp Thr Xaa
20 25 30

tta gaa gac atg act ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Asp Met Thr Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aaa cag tat gag gag ata ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Lys Gln Tyr Glu Glu Ile Pro Ile
50 55 60

gaa atc tgt gga cat aaa gct ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ttg ttg act cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aaa 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca ttw gta gaa att tgt gca gaa ctg gaa aag gaa ggg 432
Lys Ile Lys Ala Xaa Val Glu Ile Cys Ala Glu Leu Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac ggt act aaa tgg aga aag gta aca gat ttt 528
Ala Ile Lys Lys Lys Asp Gly Thr Lys Trp Arg Lys Val Thr Asp Phe
165 170 175

aga gaa ctt aat aag agg ach caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Xaa Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc tca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ser Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt ata aac aat gcg aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Ala Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac atg gtt atc tat caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Met Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gat tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg ttg aag tgg ggt ttt acc aca cca gac aaa aaa cat 960
Arg Gln His Leu Leu Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttt ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat tca ggg 1116
Ile Tyr Ser Gly
370

<210> 34 
<211> 1119 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1119) 

<223> Portion of HIV Reverse Transcriptase 
<400> 34

cct cag atc act ctt tgg caa cga ccc atc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Thr Ile Lys Ile Gly
1 5 10 15

ggg cag cta aag gaa gct cta ttr gac aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Xaa Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa ata ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Ile Ile Gly
35 40 45

gga att gga ggt ttt att aaa gta aaa cag tat gaa cag ata acc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Lys Gln Tyr Glu Gln Ile Thr Ile
50 55 60

gam atc tgt gga cat aaa gct aca ggt aca gta tta gta gga cct aca 240
Xaa Ile Cys Gly His Lys Ala Thr Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac gta att gga aga aat atg atg act cag att ggt tgc act 288
Pro Val Asn Val Ile Gly Arg Asn Met Met Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aga gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Arg Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa ttg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Leu Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aac aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta cca aag aac aaa tca gta acg gta ctg 624
Ile Pro His Pro Ala Gly Leu Pro Lys Asn Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt cct tta gat gaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg
210 215 220

aag tac act gca ttt acc ata cct agg tat aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Arg Tyr Asn Asn Glu Thr Pro Gly
225 230 235 240

act aga tat cag tac aat gtg ctt cct atg gga tgg aaa gga tca cca 768
Thr Arg Tyr Gln Tyr Asn Val Leu Pro Met Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aga 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Arg
260 265 270

caa aat cca gac ata gtt atc tat caa tac gtg gat gac ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gag ata ggg cag cat aga gcg aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Ala Lys Ile Glu Glu Leu
290 295 300

aga gaa cat ctg tgg aag tgg ggt ttt tac aca cca gac aaa aaa cat 960
Arg Glu His Leu Trp Lys Trp Gly Phe Tyr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc cat tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe His Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aag gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aaa tta gtg ggr aaa att gaa ttt ggg cga gtc 1104
Val Asn Asp Ile Gln Lys Leu Val Xaa Lys Ile Glu Phe Gly Arg Val
355 360 365

aga ttt amc caa ggg 1119
Arg Phe Xaa Gln Gly
370



<210> 35 
<211> 1115 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1115) 

<223> Portion of HIV Reverse Transcriptase 
<400> 35 

cct cag atc act ctt tgg caa cga ccc cty gtc cca ata arg ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Xaa Val Pro Ile Xaa Ile Gly
1 5 10 15

ggg caa tta aag gaa gct cta cta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gac atg aat tta cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Asp Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aar gta aaa cag tat gat cag ata ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Lys Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa atc tgt ggg cat aaa gct ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag ctt ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Leu Gly Cys Thr
85 90 95

cta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att gga cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aag gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttt tgg gaa gtc caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aaa aaa tca gta aca gta tta 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg gga gat gca tat ttt tca gtt ccc tta gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aag 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtc ata tat caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

ggg tct gac tta gaa ata gga cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cac ttg ttg maa tgg gga ttc acc aca cca gac aaa aag cat 960
Arg Gln His Leu Leu Xaa Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa ccc cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata kaa ctg cca gaa aaa gac agc tgg ctg 1056
Lys Trp Thr Val Gln Pro Ile Xaa Leu Pro Glu Lys Asp Ser Trp Leu
340 345 350

tca atg aca tac aga aat tag tgg gaa agt tga att ggg caa gtc aaa 1104
Ser Met Thr Tyr Arg Asn * Trp Glu Ser * Ile Gly Gln Val Lys
355 360 365

ttt atg cng gg 1115
Phe Met Xaa



<210> 36 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 
<400> 36 

cct cag atc act ctt tgg caa cga cca gtc gtc aca ata aag gta ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Val Val Thr Ile Lys Val Gly
1 5 10 15

ggg cag cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt rtc aaa gta aga cag tat gat caa ata ccc ata 192
Gly Ile Gly Gly Phe Xaa Lys Val Arg Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa atc tgt gga cat aaa gct aca ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Thr Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gyc aac ata att gga aga aat ctg ttg act cag att ggg tgc act 288
Pro Xaa Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt cct att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ctg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt gca gaa ttg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Ala Glu Leu Glu Lys Glu Gly
130 135 140

aag att tca aaa att ggg ccy gaa aat cca tac aay act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Xaa Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aar aac agt act ara tgg aga aaa kta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Ser Thr Xaa Trp Arg Lys Xaa Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gat ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg cta aag aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc ttg gat gaa gac ttc aga 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg
210 215 220

aag tat aca gcc ttt acc tat act ggt tcc aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Tyr Thr Gly Ser Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat car tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agc agc atg aca aaa gtc tta gaa cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Val Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtt atc tgt caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Cys Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg tta agg tgg gga ttt tac aca cca gac gaa aaa cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Tyr Thr Pro Asp Glu Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gac 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtt aat gac ata cag aaa tta gtg gga aaa ttg aat tgg gcc agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tac cca ggg 1116
Ile Tyr Pro Gly
370



<210> 37 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 
<400> 37

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata aaa ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gac atg aat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Asp Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag gta ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Val Pro Ile
50 55 60

gaa atc tgt gga cat aaa gct ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg atg aca cag ctt ggt tgt act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Met Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt cct att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

agg gaa ctt aat aag aaa act caa gac ttc tgg gaa gtt caa tta ggg 576
Arg Glu Leu Asn Lys Lys Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat cct gca gga tta aaa aag aat aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Asn Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa att tta gat cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Asp Pro Phe Arg Lys
260 265 270

cag aat cca gat ata gtt atc tat caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gag ata ggg cag cat aga gca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Ala Lys Ile Glu Glu Leu
290 295 300

aga gca cat ctg ttg aag tgg gga ttt acc acc cca gac aaa aaa cat 960
Arg Ala His Leu Leu Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aag gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa tta aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tac gca ggg 1116
Ile Tyr Ala Gly
370



<210> 38 

<211> 1117 

<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1117) 

<223> Portion of HIV Reverse Transcriptase 
<400> 38 

cct caa tca ctt ctt tgg caa cga ccc mtc gtc aca ata aag gta ggg 48
Pro Gln Ser Leu Leu Trp Gln Arg Pro Xaa Val Thr Ile Lys Val Gly
1 5 10 15

ggg cag cta aag gaa gct cta tta gat aca gga gca gat gat aca ata 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Ile
20 25 30

tta gaa gac aya rat ttg cca ggg aga tgg aaa cca aaa ata ata ggg 144
Leu Glu Asp Xaa Xaa Leu Pro Gly Arg Trp Lys Pro Lys Ile Ile Gly
35 40 45

gga att gga ggt ttt atc aga gta aga cag tat gat cag gta ccc ata 192
Gly Ile Gly Gly Phe Ile Arg Val Arg Gln Tyr Asp Gln Val Pro Ile
50 55 60

gaa atc tgt gga cat aaa gtt gta agt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Val Val Ser Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gcc aac ata att gga aga aat ctg atg act cag att ggt tgc act 288
Pro Ala Asn Ile Ile Gly Arg Asn Leu Met Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt gaa gaa ttg gaa aag gat ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Glu Glu Leu Glu Lys Asp Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aag aac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat cct gca gga tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca att ccc tta gat gaa gac ttc aga 672
Asp Val Gly Asp Ala Tyr Phe Ser Ile Pro Leu Asp Glu Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

tca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ser Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtc atc tat caa tat atg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gag ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga cag cat ctg tgg aag tgg ggg ttt tac aca cca gac ara aaa cat 960
Arg Gln His Leu Trp Lys Trp Gly Phe Tyr Thr Pro Asp Xaa Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gac 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aag gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tan tsc agg g 1117
Ile Xaa Xaa Arg
370



<210> 39
<211> 1128
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1128)
<223> Portion of HIV Reverse Transcriptase

<400> 39

cct cag atc act ctt tgg caa cga cca ttc gtc aca ata aaa ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Phe Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct ata tta gac aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Ile Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt mtc aaa gta aga cag tat gat cag gta ccc ata 192
Gly Ile Gly Gly Phe Xaa Lys Val Arg Gln Tyr Asp Gln Val Pro Ile
50 55 60

gaa atc tgt gga cat aaa gtt atg agt aca gta tta ata gga cct aca 240
Glu Ile Cys Gly His Lys Val Met Ser Thr Val Leu Ile Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg atg act cag mtt ggc tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Met Thr Gln Xaa Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gwa cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Xaa Pro Val Lys Leu Lys
100 105 110

cca ggg atg gac ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa ttg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Leu Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt aat aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Asn Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gat ttc tgg gaa gtc caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat cct gca ggg tta aaa aag aac aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Asn Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat raa gat tca gra 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Xaa Asp Ser Xaa
210 215 220

agt aca ctg cat tta cca tac cta gta cgr acc aat gag aca cca ggg 720
Ser Thr Leu His Leu Pro Tyr Leu Val Xaa Thr Asn Glu Thr Pro Gly
225 230 235 240

att aga tat caa tac aat gtg ctt cca caa gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac tta gtt atc tgt caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Leu Val Ile Cys Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gat tta gaa ata gag cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Glu Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg tgg aag tgg ggg ttt tac aca cca gac aaa aaa cat 960
Arg Gln His Leu Trp Lys Trp Gly Phe Tyr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc cgt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Arg Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta caa gcc tat aaa gct gcc aga aaa aga cag ctg gac 1056
Lys Trp Thr Val Gln Ala Tyr Lys Ala Ala Arg Lys Arg Gln Leu Asp
340 345 350

tgt caa tga cat tac mag aaa gtt agt ggg gaa aat tgg aat ttg ggg 1104
Cys Gln * His Tyr Xaa Lys Val Ser Gly Glu Asn Trp Asn Leu Gly
355 360 365

caa ggt cag att tat tgc cag ggg 1128
Gln Gly Gln Ile Tyr Cys Gln Gly
370 375

<210> 40
<211> 1120
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1120)
<223> Portion of HIV Reverse Transcriptase

<400> 40

cct cag atc act ctt tgg caa cga ccc ctc gtt gca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Ala Ile Lys Ile Gly
1 5 10 15

gga cag cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg agt ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Ser Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga car tat gat cag ata ccm rta 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Xaa Xaa
50 55 60

gaa att tgc gga cat aaa gct gta ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Val Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag mtt ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Xaa Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gtg aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat cct gca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205



gat gtg ggt gat gca tat ttt tca gtt cct tta gat gaa gac ttc agr 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Xaa
210 215 220

aag tat act gca ttt acc ata cct agt aca aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tcc aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Ser Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc cta gaa cct ttt agg aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gat ata gtt atc tat caa tac atg gat gat cta tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata gaa cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Glu Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg ttg agg tgg ggg ttt acc acc cca gac aaa aaa cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa ccc cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac nat aca aaa gtt agt ggg gaa aat tga att ggg sca agt 1104
Val Asn Asp Xaa Thr Lys Val Ser Gly Glu Asn * Ile Gly Xaa Ser
355 360 365

cag att tat tgg agg g 1120
Gln Ile Tyr Trp Arg
370

<210> 41
<211> 1059
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1059)
<223> Portion of HIV Reverse Transcriptase

<400> 41

cct caa atc act ctt tgg cag cga ccc gtt gtc aca ata aac ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Val Val Thr Ile Asn Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gac aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag ata ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa atc tgt gga cat aaa act ata ggt aca gta tta ata gga cct aca 240
Glu Ile Cys Gly His Lys Thr Ile Gly Thr Val Leu Ile Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag att ggc tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta ata gaa att tgt aca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Ile Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aac ccg tac aat act cca gtc ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gat agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aac aag aaa act caa gac ttc tgg gaa att caa tta gga 576
Arg Glu Leu Asn Lys Lys Thr Gln Asp Phe Trp Glu Ile Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttc tca gtt cct tta gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt aca aac aat gag acg cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gcc ata nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn 816
Ala Ile Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
260 265 270

nnn nnn nnn nnn nnn nnn nnn tat caa tac atg gat gat ttg tat gta 864
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata gag cag cat aga aca aaa ata gag aaa ctg 912
Gly Ser Asp Leu Glu Ile Glu Gln His Arg Thr Lys Ile Glu Lys Leu
290 295 300

aga caa cat ctg ttg agg tgg gga ttt acc aca cca gat aaa aaa cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttt ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gta ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc 1059
Val



<210> 42
<211> 1053
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1053)
<223> Portion of HIV Reverse Transcriptase

<400> 42

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata arg ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Xaa Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atm aaa gta aga cag tat gat cag ata cyc ata 192
Gly Ile Gly Gly Phe Xaa Lys Val Arg Gln Tyr Asp Gln Ile Xaa Ile
50 55 60

gaa atc tgt gga yat aaa gct ata ggt acr gta tta gta gga ccc acg 240
Glu Ile Cys Gly Xaa Lys Ala Ile Gly Xaa Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac rta att gga aga aat ctg wtg act cag att ggt tgc act 288
Pro Val Asn Xaa Ile Gly Arg Asn Leu Xaa Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa ttr gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Xaa Val Asp Phe
165 170 175

aga gaa ctt aat aag aaa act caa gac ttc tgg gaa gtc caa tta gga 576
Arg Glu Leu Asn Lys Lys Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aag aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt gta aac aat gag aca cca kgg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Val Asn Asn Glu Thr Pro Xaa
225 230 235 240

att aga tay cag tac aat gtg ctt cca caa gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata tty caa tgt agc atg aca aaa atc tta gag cct ttt aga aag 816
Ala Ile Phe Gln Cys Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac cta gtt att tat caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Leu Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg ttg ara tgg gga ttt acc aca cca gac aaa aaa cat 960
Arg Gln His Leu Leu Xaa Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa ccc cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg gca gtg caa cct ata gtg ctg cca gaa aaa gac agc tgg 1053
Lys Trp Ala Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp
340 345 350

<210> 43
<211> 1082
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1082)
<223> Portion of HIV Reverse Transcriptase

<400> 43

cct caa atc act ctt tgg caa cga ccc ctt gtc aca rta aag rta ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Xaa Lys Xaa Gly
1 5 10 15

ggg caa cta aag gaa gct yta ttr gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Xaa Xaa Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat tta cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag ata ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa aty tgt ggg cat aaa gct ata ggt aca gta tta gta ggg cct aca 240
Glu Xaa Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ttg ttg act cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt cct att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc ccc aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aaa gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aag gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttt tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata ccg cat ccc gca ggg tta aaa aag aaa aag tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aaa tat ast gca ttt acc ata ccg agt ata aac aat gag aca cca ggg 720
Lys Tyr Xaa Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt ccg cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa tgt agc atg aca aaa atc tta gaa cct ttt aga aaa 816
Ala Ile Phe Gln Cys Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtt atc tat caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac ttg gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga cag cat ctg ttg aaa tgg ggr ttt acc aca cca gac aag aaa cat 960
Arg Gln His Leu Leu Lys Trp Xaa Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggg tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta caa ccg ata gag ctg cca gaa aaa gaa agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Glu Leu Pro Glu Lys Glu Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gg 1082
Val Asn Asp Ile Gln Lys Leu Val
355 360



<210> 44
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 44

cct cag atc act ctt tgg caa cga ccc atc gtc aca gta aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Thr Val Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct yta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Xaa Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat tta cca gga aaa tgg aaa cca aaa ata ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Lys Trp Lys Pro Lys Ile Ile Gly
35 40 45

gga att gga ggt ttt gcc aaa gta aga cag tat gat cag ata ccc ata 192
Gly Ile Gly Gly Phe Ala Lys Val Arg Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa atc tka gga cat aaa gtt ata ggt aca gtc tta gta gga cct aca 240
Glu Ile Xaa Gly His Lys Val Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gcc aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288
Pro Ala Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110



cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aag att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa aac agy act wga tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Xaa Thr Xaa Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa ttr gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Xaa Gly
180 185 190

ata cca cat ccc tca ggg tta aaa aag aam aaa tca gta aca gta ctg 624
Ile Pro His Pro Ser Gly Leu Lys Lys Xaa Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aaa tat act gca ttt acc ata cct agt rta aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Xaa Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aga atc cta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Arg Ile Leu Glu Pro Phe Arg Lys
260 265 270

cag aat cca gac ata gtt atc tat caa tac gtg gat gac ttg ctt gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Leu Val
275 280 285

gga tct gat tta gaa ata ggg caa cat aga gca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Ala Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg ttg agg tgg ggg ttt atc aca cca gac gaa aaa cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Ile Thr Pro Asp Glu Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag ccc ata gtg ctg cca gaa aaa gay agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata caa aag tta gtg gga aaa ttg aat tgg gca agc cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat gca ggg 1116
Ile Tyr Ala Gly
370



<210> 45
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 45

cct cag atc act ctt tgg caa cga ccc rtc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Xaa Val Thr Ile Lys Ile Gly
1 5 10 15

ggg cag cta aag gaa gct cta tta gat aca gga gca gac gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat tta cca gga aaa tgg aaa cca aaa atg ata gtg 144
Leu Glu Glu Met Asn Leu Pro Gly Lys Trp Lys Pro Lys Met Ile Val
35 40 45

gga att gga gga ttt gtc aaa gta aaa cag tat gag caa ata cct gta 192
Gly Ile Gly Gly Phe Val Lys Val Lys Gln Tyr Glu Gln Ile Pro Val
50 55 60

gaa atc tgt gga cat aaa gct gta ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Val Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gcc aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288
Pro Ala Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca aaa gar 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Lys Glu
115 120 125

aaa ata maa gca ttg gta gaa att tgt aca gaa atg gaa aag gaa gga 432
Lys Ile Xaa Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gtg ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gct ata aag aaa aag aac agt gat aga tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Ser Asp Arg Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag agg act caa gac ttc tgg gaa att caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Ile Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aag aaa tca gta aca rta cta 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Xaa Leu
195 200 205

gat gtg ggt gat gca tat ttt tca rtt ccc tta gat aaa gaa ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Xaa Pro Leu Asp Lys Glu Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat caa tac aat gtg ctt cca caa gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa gct agc atg aca aaa atc tta gag cct ttc aga aaa 816
Ala Ile Phe Gln Ala Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gaa cta gtt atc tat caa tac gtg gat gac ttg tat gta 864
Gln Asn Pro Glu Leu Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata gga cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga gaa cat ctg tta aaa tgg gga tta ttc aca cca gac cag aaa cat 960
Arg Glu His Leu Leu Lys Trp Gly Leu Phe Thr Pro Asp Gln Lys His
305 310 315 320

cag aaa gaa ccc cca ttt ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg act ata cag cct atg gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Ile Gln Pro Met Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac cta cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Leu Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat cca ggg 1116
Ile Tyr Pro Gly
370

<210> 46
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 46

cct caa atc act ctt tgg caa cga ccc ctc gtc aca ata aaa gta ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Val Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga agg tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag ata tcc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Ser Ile
50 55 60

gaa atc tgt gga cat aaa gct ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt cct att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gac ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gag att tgt aca gaa atg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aac cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aag tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aaa aga act caa gac ttc tgg gag gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aaa aaa tca gta aca gta cta 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggc gat gca tat ttc tca gtt ccc tta gat gaa gac ttc aga 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg
210 215 220

aaa tat act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

act aga tat cag tac aat gtg ctc cca cag gga tgg aaa gga tca cca 768
Thr Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa tgt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Cys Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac cta gtt atc tat caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Leu Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata gga cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg ttg agg tgg gga ttt acc acc cca gac aaa aaa cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttt ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gtr cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Xaa Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tac cca ggg 1116
Ile Tyr Pro Gly
370



<210> 47
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 47

cct caa atc act ctt tgg caa cga ccc ctc gtc aca gta aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Val Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gac atg tgt ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Asp Met Cys Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga caa tat gat cag gta gcc atg 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Val Ala Met
50 55 60

gaa atc tgt gga cat aag gct ata ggt aca gta tta ata gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Ile Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agc cct att gaa act gta ccm gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Xaa Val Lys Leu Lys
100 105 110

cca ggr atg gat ggt cca agg gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Xaa Met Asp Gly Pro Arg Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata ara gca tta gta gaa att tgt aca gaa atg gaa aag gaa gga 432
Lys Ile Xaa Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttt 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aaa act caa gac tty tgg gaa gtt caa tta ggr 576
Arg Glu Leu Asn Lys Lys Thr Gln Asp Phe Trp Glu Val Gln Leu Xaa
180 185 190

ata ccg cat ccc gca ggg tta aaa aag aaa aaa tca gta aca gta ctt 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg gga gat gca tat ttt tea gtt ccc tta gat aaa gat ttc agg 672 
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg 
210 215 220 

aag tat act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720 
Lys Tyr Thr Ala Phe Thr lie Pro Ser lie Asn Asn Glu Thr Pro Gly 
225 230 235 240 

att aga tat caa tac aat gtg ctt cca cag gga tgg aaa gga tea cca 768 
lie Arg Tyr Gin Tyr Asn Val Leu Pro Gin Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa agt age atg aca aaa att tta gag cct ttt aga aag 816 
Ala lie Phe Gin Ser Ser Met Thr Lys lie Leu Glu Pro Phe Arg Lys 
260 265 270 

caa aat cca gac ata gtt ate tat caa tac gtg gat gat ttg tat gta 864 
Gin Asn Pro Asp lie Val lie Tyr Gin Tyr Val Asp Asp Leu Tyr Val 
275 280 285 

gga tct gac tta gaa ata gga cag cat aga aca aaa ata gag gaa ctr 912 
Gly Ser Asp Leu Glu lie Gly Gin His Arg Thr Lys lie Glu Glu Xaa 
290 295 300 

aga caa cat ctg ttg aag tgg ggg ytt acc aca cca gac aag aaa cat 96 0 

Arg Gin His Leu Leu Lys Trp Gly Xaa Thr Thr Pro Asp Lys Lys His 
305 310 315 320 

cag aaa gaa ccy cca ttc ctt tgg atg ggk tat gaa etc cat cct gat 100 8 

Gin Lys Glu Xaa Pro Phe Leu Trp Met Xaa Tyr Glu Leu His Pro Asp 
325 330 335 

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac age tgg act 1056 
Lys Trp Thr Val Gin Pro lie Val Leu Pro Glu Lys Asp Ser Trp Thr 
340 345 350 

gtc aat gac ata cag aag tta gtg gga aar ttg aat tgg gca agt cag 1104 
Val Asn Asp lie Gin Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gin 
355 360 365 

att tat gca ggg 1116 
lie Tyr Ala Gly 
370 

<210> 48 
<211> 1115 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220>

<221> CDS

<222> (0)...(297)

<223> HIV Protease

<221> CDS

<222> (298)...(1115)

<223> Portion of HIV Reverse Transcriptase 
<400> 48 

cct cag atc act ctt tgg caa cga ccc ctc gtc aca gta aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Val Lys Ile Gly
1 5 10 15

ggg cag cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

ata gaa gac ata gaa ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Ile Glu Asp Ile Glu Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aaa cag tat gag cag gta ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Lys Gln Tyr Glu Gln Val Pro Ile
50 55 60

gaa ctc tgt ggg cgt aaa act ata ggt aca gta tta gta gga cct aca 240
Glu Leu Cys Gly Arg Lys Thr Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aac ctg atg act cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Met Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta ata gaa att tgt aca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Ile Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aac act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcy ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Xaa Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aaa act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Lys Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat cct gca ggg tta aaa aag aag aaa tca gta aca gta ttg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccg tta gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctk cca cag gga tgg aag gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Xaa Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc ttg gag ccc ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac cta gtt atc tat caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Leu Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

ggc tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg ttg aag tgg gga ttt acc aca cca gat aaa aaa cat 960
Arg Gln His Leu Leu Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttt ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tcc car ga 1115
Ile Ser Gln
370



<210> 49 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220>

<221> CDS

<222> (0)...(297)

<223> HIV Protease

<221> CDS

<222> (298)...(1116)

<223> Portion of HIV Reverse Transcriptase 
<400> 49 

cct cag atc act ctt tgg caa cga ccc ctc gtc rca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Xaa Ile Lys Ile Gly
1 5 10 15

ggg cag cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aag atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttc atc aaa gta aga cag tat gat cag ata ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa atc tgt ggc cat aaa gct ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat cta ttg act cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aag tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa atc tgt aca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aag gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aaa act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Lys Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat cct gca ggg tta aaa aag aam aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Xaa Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gaa ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Glu Phe Arg
210 215 220

aag tat acc gca ttt cca tcc cta gtt ata aac aat gag aca cca gga 720
Lys Tyr Thr Ala Phe Pro Ser Leu Val Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

atc aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttt caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtt atc tat caa tac gtg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga aca aaa gta gag gag ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Val Glu Glu Leu
290 295 300

aga caa cat ctg ttg agg tgg gga ttt acc aca cca gac aaa aaa cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gag cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa tta aat tgg gca agc cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tac cca ggg 1116
Ile Tyr Pro Gly
370

<210> 50 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220>

<221> CDS

<222> (0)...(297)

<223> HIV Protease

<221> CDS

<222> (298)...(1116)

<223> Portion of HIV Reverse Transcriptase 
<400> 50 

cct cag atc act ctt tgg caa cga ccc ttc gtc aac ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Phe Val Asn Ile Lys Ile Gly
1 5 10 15

gga caa ctg aag gaa gct cta ttg gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa ttg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Leu Ile Gly
35 40 45

gga att gga ggt ttk gtc aaa gta aga cag tat gat cag ata cct gta 192
Gly Ile Gly Gly Xaa Val Lys Val Arg Gln Tyr Asp Gln Ile Pro Val
50 55 60

gaa att tgt gga cat aaa gyt ata ggt aca gtc tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Xaa Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gcc aac ata att gga aga aat ctg ttg act cag att ggc tgc act 288
Pro Ala Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc ccg aga gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Arg Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa ttg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Leu Glu Lys Glu Gly
130 135 140

aaa att tca aag att ggg cct gaa aat cca tac aat act cca ata ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Ile Phe
145 150 155 160

gcc ata aag aaa aag aac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta mam aag aac aaa tca gta aca gtg cta 624
Ile Pro His Pro Ala Gly Leu Xaa Lys Asn Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta tat gaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Tyr Glu Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt aca aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tay aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc cag agt agc atg aca aga atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Arg Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gaa ata gtc atc tat caa tac gtg gat gat ttg tat gta 864
Gln Asn Pro Glu Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gca tct gac tta gaa ata gag aaa cat aga aca aaa ata gag gaa ctg 912
Ala Ser Asp Leu Glu Ile Glu Lys His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg ttg agg tgg gga ttt tac aca cca gac aaa aag cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Tyr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aaa tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat gga ggg 1116
Ile Tyr Gly Gly
370



<210> 51 

<211> 1116 

<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220>

<221> CDS

<222> (0)...(297)

<223> HIV Protease

<221> CDS

<222> (298)...(1116)

<223> Portion of HIV Reverse Transcriptase 
<400> 51 

cct cag atc act ctt tgg caa cga ccc atc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa ata ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Ile Ile Gly
35 40 45

gga att gga ggt ttt aty aaa gta aga cag tat gat cag ata cct ata 192
Gly Ile Gly Gly Phe Xaa Lys Val Arg Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa atc tgt gga cat aaa act ata ggt aca gta tta ata gga cct aca 240
Glu Ile Cys Gly His Lys Thr Ile Gly Thr Val Leu Ile Gly Pro Thr
65 70 75 80

cct gcc aac ata att gga aga gat ctg ttg act cag att ggt tgc act 288
Pro Ala Asn Ile Ile Gly Arg Asp Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa ttg aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggt cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa ctg gaa aag gat ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Leu Glu Lys Asp Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cct gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa aac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gcg ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca att ccc tta gat gaa gaa ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Ile Pro Leu Asp Glu Glu Phe Arg
210 215 220

aag tat act gca ttc acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

gtt aga tat caa tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Val Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag ccc ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtt atc tat caa tat gtg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga gca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Ala Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg tgg agg tgg ggg ttt tac aca cca gac aaa aaa cat 960
Arg Gln His Leu Trp Arg Trp Gly Phe Tyr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa ccc cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta caa cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aaa tta gtg ggg aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat gca ggg 1116
Ile Tyr Ala Gly
370



<210> 52 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220>

<221> CDS

<222> (0)...(297)

<223> HIV Protease

<221> CDS

<222> (298)...(1116)

<223> Portion of HIV Reverse Transcriptase 
<400> 52 

cct caa atc act ctt tgg caa cga ccc ctt gtc aca ata aag rta ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Xaa Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa atr ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Xaa Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag ata ycc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Xaa Ile
50 55 60

gaa atc tgt gga cat aaa gct ata ggt tca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Ser Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata aty gga aga aat ctg atg act cag att ggt tgc act 288
Pro Val Asn Ile Xaa Gly Arg Asn Leu Met Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa ack gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Xaa Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aag caa tgg cca ttg aca gra gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Xaa Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gag atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aga att ggg ccc gaa aat cca tac aat act cca ata ttt 480
Lys Ile Ser Arg Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Ile Phe
145 150 155 160

gcc ata aag aaa aag aat agt act aga tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Ser Thr Arg Trp Arg Lys Leu Val Asp Phe
165 170 175

agg gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aac aaa tca gtg aca gta ytg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Asn Lys Ser Val Thr Val Xaa
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat gaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt atr aac aat gag aaa cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Xaa Asn Asn Glu Lys Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca car gga tgg aaa ggg tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa tgt agc atg aca aaa aty tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Cys Ser Met Thr Lys Xaa Leu Glu Pro Phe Arg Lys
260 265 270

car aat cca gac ata gtt atc tat caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata gaa cag cat aga aca aaa ata gag gaa ttg 912
Gly Ser Asp Leu Glu Ile Glu Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg tta agg tgg gga ttt ttc aca cca gaa caa aaa cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Phe Thr Pro Glu Gln Lys His
305 310 315 320

cag aaa gaa ccg cca ttc ctt tgg atg ggt tat gaa cta cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg acg gta cag cct ata aag ctg cca gaa aaa gat agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Lys Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa tta aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tay gca ggg 1116
Ile Tyr Ala Gly
370

<210> 53 

<211> 1116 

<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220>

<221> CDS

<222> (0)...(297)

<223> HIV Protease

<221> CDS

<222> (298)...(1116)

<223> Portion of HIV Reverse Transcriptase 
<400> 53 

cct cag atc act ctt tgg caa cga ccc atc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aaa gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat tta cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gtg aga cag tat gat cag rta ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Xaa Pro Ile
50 55 60

gaa att tgt gga cat aaa gct ata ggt aca gta tta gta gga tct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Ser Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag ctt ggg tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gag atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aaa act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Lys Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

atc cca cat cct gca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gac ttc cgg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt aca aac aat gag aca cca gga 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca caa gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt agg aat 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Asn
260 265 270

aaa aat cca gac ata gtt atc tat caa tac gtg gat gat ttg tat gta 864
Lys Asn Pro Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac cta gaa ata ggg cag cat aga gca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Ala Lys Ile Glu Glu Leu
290 295 300

aga gaa cat ctg ttg aag tgg ggg ttt act aca cca gac aaa aaa cat 960
Arg Glu His Leu Leu Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gtc cag cct ata gag ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Glu Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat gca gga 1116
Ile Tyr Ala Gly
370

<210> 54 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220>

<221> CDS

<222> (0)...(297)

<223> HIV Protease

<221> CDS

<222> (298)...(1116)

<223> Portion of HIV Reverse Transcriptase 
<400> 54 

cct cag atc act ctt tgg caa cga ccc aty gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Xaa Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct yta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Xaa Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gac atg gat ttg cca gga aga tgg aaa cca aaa atg ata gtg 144
Leu Glu Asp Met Asp Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Val
35 40 45

gga att gga ggt ttt gtc aaa gta aga cag tat gat cag ata ccc ata 192
Gly Ile Gly Gly Phe Val Lys Val Arg Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa atc tgt gga cat aaa att ata ggt aca gta tta ata gga aat aca 240
Glu Ile Cys Gly His Lys Ile Ile Gly Thr Val Leu Ile Gly Asn Thr
65 70 75 80

cct gcc aac gta att gga aga aat ctg ttg act cag ctt ggt tgc act 288
Pro Ala Asn Val Ile Gly Arg Asn Leu Leu Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa ctg gaa aag gat ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Leu Glu Lys Asp Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aag gac agt act aaa tgg aga aaa gta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Val Val Asp Phe
165 170 175

aga gaa ctt aac aag aga act caa gac ttc tgg gag gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cac ccc gca ggg ata aaa aag aat aaa tca gta act gta cta 624
Ile Pro His Pro Ala Gly Ile Lys Lys Asn Lys Ser Val Thr Val Leu
195 200 205

gat gta ggt gat gca tat ttc tca gtt ccc tta gat gaa gac ttc aga 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg
210 215 220

aaa tat act gca ttc acc ata cct agt att aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctc cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtt atc tat caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cac aga ata aaa ata rag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Ile Lys Ile Xaa Glu Leu
290 295 300

aga gaa cat cta tgg aag tgg gga ttt tac aca cca gac aaa aag cat 960
Arg Glu His Leu Trp Lys Trp Gly Phe Tyr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata acg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Thr Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg ggg aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat gca ggg 1116
Ile Tyr Ala Gly
370



<210> 55 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220>

<221> CDS

<222> (0)...(297)

<223> HIV Protease

<221> CDS

<222> (298)...(1116)

<223> Portion of HIV Reverse Transcriptase 
<400> 55 

cct caa atc act ctt tgg caa cga ccc ctc gtc gca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Ala Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gtc 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45



gga att gga ggt ttt atc aaa gta aag cag tat gat cag gta ctt gta 192
Gly Ile Gly Gly Phe Ile Lys Val Lys Gln Tyr Asp Gln Val Leu Val
50 55 60

gaa att tgt gga cat ara gct ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Xaa Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag att ggt tgt act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca ggt atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta ata gaa att tgt aca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Ile Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt acc aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aaa acg caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Lys Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aag gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt gta aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Val Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctg cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttt caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac atg gtt atc tat caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Met Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata gag cag cat aga rca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Glu Gln His Arg Xaa Lys Ile Glu Glu Leu
290 295 300

agg cag cat ctg ttg agg tgg gga ttt acc aca cca gac aaa aag cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata ktg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Xaa Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa tta aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tam ccc ngg 1116
Ile Xaa Pro Xaa
370

<210> 56
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 56

cct caa atc act ctt tgg caa cga ccc att gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga aaa tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Lys Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag ata acc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Thr Ile
50 55 60

gaa atc tgt gga cat aaa gct ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aaa gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gat agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gta caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aaa aaa tca gta aca gta cta 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gaa ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Glu Phe Arg
210 215 220

aag tat act gca ttc acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agc agc atg aca aaa att tta gaa cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gaa ata gtt atc tat caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Glu Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta raa ata gag cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Xaa Ile Glu Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg ttg agg tgg gga ttt acc aca cca gac aaa aag cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa cag gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Gln Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa tta aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat cca ggg 1116
Ile Tyr Pro Gly
370



<210> 57
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 57

cct cag atc act ctt tgg caa cga ccc ctc gtc aca gta aag tta ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Val Lys Leu Gly
1 5 10 15

ggg caa cta atg gaa gtt cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Met Glu Val Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

rta gaa gaa ata agt tta cca gga aga tgg aaa cca aaa atg ata ggg 144
Xaa Glu Glu Ile Ser Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt gtc aaa gta aaa cag tat gat cag gta ccc tta 192
Gly Ile Gly Gly Phe Val Lys Val Lys Gln Tyr Asp Gln Val Pro Leu
50 55 60

gaa att tgt gga aaa aag gct ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly Lys Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gcc aac ata att gga aga aat ttt ttg gct cag att ggt tgc act 288
Pro Ala Asn Ile Ile Gly Arg Asn Phe Leu Ala Gln Ile Gly Cys Thr
85 90 95

tta aat ttc ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aag aac agt act aga tgg aga aaa tta gta gat ttt 528
Ala Ile Lys Lys Lys Asn Ser Thr Arg Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag agg acs caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Xaa Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aar aag aac aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Asn Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat cca gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Pro Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt aca aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca caa gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cca ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtt atc tgt caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Cys Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata gag cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Glu Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg ttg agg tgg gga ttt tac aca cca gac caa aaa cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Tyr Thr Pro Asp Gln Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata acg ctg cca gac aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Thr Leu Pro Asp Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa tta aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat gca ggg 1116
Ile Tyr Ala Gly
370



<210> 58
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 58

cct caa atc act ctt tgg caa cga ccc cta gtt aca ata aaa ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg act ttg cca gga aaa tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Thr Leu Pro Gly Lys Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga car tat gat cag ata ctc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Leu Ile
50 55 60

gaa att tgt gga cat aaa gct ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag atc ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gag act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aga gtt aar caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Arg Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gag aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat cca gca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca ata atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Ile Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtt atc tat caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga cag cat ctg ttg agg tgg gga ttt acc aca cca gac aaa aaa cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa ccc cca ttc ctt tgg atg ggt tat gaa ctc cat cca gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata aag ctg cca gac aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Lys Leu Pro Asp Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat gca gga 1116
Ile Tyr Ala Gly
370



<210> 59
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 59

cct caa atc act ctt tgg caa cga ccc tta gtc aca ata aag ata grg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Xaa
1 5 10 15

ggg caa cta aaa gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa ata aat ttg cca ggg aaa tgg aaa cca maa atg ata ggg 144
Leu Glu Glu Ile Asn Leu Pro Gly Lys Trp Lys Pro Xaa Met Ile Gly
35 40 45

gga att gga ggt ttt att aaa gta aga cag tat gat caa ata gcc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Ala Ile
50 55 60

gaa att tgt gga cat aaa gct ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80



cct gtc aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta rta gaa atc tgt aca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Xaa Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcm ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Xaa Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtc caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aaa aaa tca gta aca gta cta 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttc tca gtt ccc tta gac caa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Gln Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca agg atc tta gar cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Arg Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gaa ata gtc aty tat cag tac atg gat gat tta tat gta 864
Gln Asn Pro Glu Ile Val Xaa Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga aca aaa gta gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Val Glu Glu Leu
290 295 300

aga caa cat ctg ttg agr tgg ggg ttt tmc acg cca gac aaa aag cat 960
Arg Gln His Leu Leu Xaa Trp Gly Phe Xaa Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag act ata gaa ctg cca gaa aaa gat agc tgg act 1056
Lys Trp Thr Val Gln Thr Ile Glu Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

ata tac cca ggg 1116
Ile Tyr Pro Gly
370



<210> 60
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 60

cct caa atc act ctt tgg cag cga ccc cty gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Xaa Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aaa gaa gct cta tta gay aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca ggr aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Xaa Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag ata cct rta 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Pro Xaa
50 55 60

gaa att tgt gga cat aaa gct ata ggt aca gta tta ata gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Ile Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg atg act cag ctt ggc tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Met Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt cct att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aga gtt aaa caa tgg cca ttg aca gaa gag 384
Pro Gly Met Asp Gly Pro Arg Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt aat aga tgg aga aaa tta gtg gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Asn Arg Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aar aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat cct gca ggg tta raa aag aac aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Xaa Lys Asn Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gaa ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Glu Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt acc aat aat gag aca ccm ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Xaa Gly
225 230 235 240

gtt aga tat cag tat aat gta ctt ccc cag gga tgg aaa gga tca cca 768
Val Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca tat tty caa tgt agy atg aca aaa atc tta aag cct ttc agg aaa 816
Ala Tyr Phe Gln Cys Xaa Met Thr Lys Ile Leu Lys Pro Phe Arg Lys
260 265 270

caa aat cca cac ata gtt att ttt caa tat gtg gat gac ttg tat gta 864
Gln Asn Pro His Ile Val Ile Phe Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gca tct gac tta gaa ata gag cag cat aga aca aaa ata gag gaa ctg 912
Ala Ser Asp Leu Glu Ile Glu Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ttg ttg agg tgg gga ctc acc aca cca gac aaa aaa cat 960
Arg Gln His Leu Leu Arg Trp Gly Leu Thr Thr Pro Asp Lys Lys His
305 310 315 320

caa aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag ccc ata acg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Thr Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat gca ggg 1116
Ile Tyr Ala Gly
370

<210> 61
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 61

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata aaa gat agg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Asp Arg
1 5 10 15

ggg gca agt aaa gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Ala Ser Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa ata aat ttg cca ggg rag tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Ile Asn Leu Pro Gly Xaa Trp Lys Pro Lys Met Ile Gly
35 40 45



gga att gga ggt ttt atc aaa gta aga cag tmt gat cag ata ccc gta 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Xaa Asp Gln Ile Pro Val
50 55 60

gaa att tgt gga cat aag gct ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag mtt ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Xaa Gly Cys Thr
85 90 95

tta aat ttt ccc atc agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca tta aca gag gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca ata ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Ile Phe
145 150 155 160

gcc ata aag aaa aag gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt cag tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat cct gca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat gaa agc ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Ser Phe Arg
210 215 220

aag tac act gca ttt acc ata ccc agt aca aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240

rca aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Xaa Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gaa atg gtt atc tat caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Glu Met Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gag ata gag caa cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Glu Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg ttg agg tgg gga ttt acc aca cca gac aaa aaa cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttt ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aag gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa tta aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat cca ggg 1116
Ile Tyr Pro Gly
370

<210> 62
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 62

cct caa atc act ctt tgg caa cga ccc atc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Thr Ile Lys Ile Gly
1 5 10 15

ggg cag cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga caa tat gat cag ata gcc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Ala Ile
50 55 60

gaa atc tgt gga cat aaa gct ata ggt aca gta tta ata gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Ile Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg atg act cag att ggc tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Met Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gat act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Asp Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aag aat agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg cta aaa aag aay aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Asn Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtc ccc tta gat gaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg
210 215 220

aag tac act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

rtt aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Xaa Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

tca ata ttc caa tgt agc atg acg aaa atc tta gag cct ttt aga aaa 816
Ser Ile Phe Gln Cys Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

cag aat cca gac ata gtt atc trt caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Xaa Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gca tct gac tta gaa ata gag cag cat aga ata aaa ata gag gaa cta 912
Ala Ser Asp Leu Glu Ile Glu Gln His Arg Ile Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg ttg aag tgg gga ttt tac aca cca gac aaa aaa yat 960
Arg Gln His Leu Leu Lys Trp Gly Phe Tyr Thr Pro Asp Lys Lys Xaa
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gar ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag ttr gtg gga aaa ctg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Xaa Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tac cca ggg 1116
Ile Tyr Pro Gly
370

<210> 63
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 63

cct caa atc act ctt tgg caa cga ccc gtt gtt aca gta agg ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Val Val Thr Val Arg Ile Gly
1 5 10 15

gga cag cta acg gaa gct yta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Thr Glu Ala Xaa Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg act ttg cca gga aaa tgg aaa cca aaa ata ata ggg 144
Leu Glu Glu Met Thr Leu Pro Gly Lys Trp Lys Pro Lys Ile Ile Gly
35 40 45

ggr att gga ggt ttt atc aaa gta aga cag tat gat cac gta ctt gta 192
Xaa Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp His Val Leu Val
50 55 60

gaa atc tgt gga cat aaa gct ata ggt aca gta tta ata gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Ile Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ttg atg act cag ctt ggg ttc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Met Thr Gln Leu Gly Phe Thr
85 90 95

tta aat ttt cca att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca ggg atg gat ggc cca aaa gtt aaa caa tgg cca ttg mca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Xaa Glu Glu
115 120 125

aaa ata aaa gca cta aca gaa att tgt aca gaa ttg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Thr Glu Ile Cys Thr Glu Leu Glu Lys Glu Gly
130 135 140

aaa att tca aga ata ggg cct gaa aat cca tac aat act cca ata ttt 480
Lys Ile Ser Arg Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Ile Phe
145 150 155 160

gcc ata aag aag aaa aac ggt ayt agg tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Gly Xaa Arg Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gag cta aat aag aga act caa gac ttc tgg gaa gtt caa cta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat cct gca gga cta aaa aag aac aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Asn Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta cat gaa gac ttt aga 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu His Glu Asp Phe Arg
210 215 220

aag tat acc gca ttc acc ata cct agt aca aac aat gaa aca cca gga 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240



att aga tat cag tac aat gtg ctt cca caa gga tgg aaa gga tca ccg 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg acc aaa atc tta gaa cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gaa atg gtt atc tat caa tac gtg gat gat ttg tat gta 864
Gln Asn Pro Glu Met Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga ata aaa ata gag gaa tta 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Ile Lys Ile Glu Glu Leu
290 295 300

agg gaa cac cta ttg aag tgg gga ttt ttc acc cca gac gaa aag cat 960
Arg Glu His Leu Leu Lys Trp Gly Phe Phe Thr Pro Asp Glu Lys His
305 310 315 320

cag aaa gaa cct cca ttt ctt tgg atg ggt tat gaa ctt cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gtg cag cct ata aaa ctg cca gaa aaa gaa agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Lys Leu Pro Glu Lys Glu Ser Trp Thr
340 345 350

gtc aat gat ata cag aag tta gtg gga aaa tta aat tgg gca agc cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat cca gga 1116
Ile Tyr Pro Gly
370



<210> 64
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 64

cct cag atc act ctt tgg caa cga ccc atc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat tta cca gga aaa tgg aaa cca aaa atr ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Lys Trp Lys Pro Lys Xaa Ile Gly
35 40 45

gga att gga ggy ttt rtc aaa gta aga cag tat gat cag ata syc ata 192
Gly Ile Gly Xaa Phe Xaa Lys Val Arg Gln Tyr Asp Gln Ile Xaa Ile
50 55 60

gaa atc tgc gga cat aaa gct ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gyc aac ata att gga aga aat ctg ttg act cag ctt ggg tgc act 288
Pro Xaa Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta caa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Gln Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca tta aca gaa gaa 3 84 

Pro Gly Met Asp Gly Pro Lys Val Lys Gin Trp Pro Leu Thr Glu Glu 
115 120 125 

aag ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gct ata aag aaa aag gac agt gct aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Ala Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

agg gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta ggg 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cck cat ccc gca ggg ttr aaa aag aaa aaa tca gta aca gta cta 624
Ile Xaa His Pro Ala Gly Xaa Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gta ggt gat gca tat ttt tca gtt ccc tta gat caa aac ttc aga 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Gln Asn Phe Arg
210 215 220

aag tat act gca ttc acc ata cct agt ata aac aat gag ayg cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Xaa Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gar ata rtt atc tat caa tac gtg gat gat ttg tat gta 864
Gln Asn Pro Glu Ile Xaa Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac ttr gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Xaa Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ytg ttg aag tgg gga ttt acc aca cca gac aag aag cat 960
Arg Gln His Xaa Leu Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggg tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata atg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Met Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat gca gga 1116
Ile Tyr Ala Gly
370



<210> 65
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase
<400> 65

cct cag atc act ctt tgg caa cga ccc atc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Thr Ile Lys Ile Gly
1 5 10 15

ggg cag cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gac atc aat ttg cca gga aaa tgg aaa cca aaa atg ata ggg 144
Leu Glu Asp Ile Asn Leu Pro Gly Lys Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt gtc aaa gta aga gag tat gat cag gta ccc ata 192
Gly Ile Gly Gly Phe Val Lys Val Arg Glu Tyr Asp Gln Val Pro Ile
50 55 60

gac atc tgt gga cat aaa gtt ata ggt aca gtg tta gta gga cct aca 240
Asp Ile Cys Gly His Lys Val Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gcc aac ata att gga aga aat ctg ttg act cag ctt ggt tgc act 288
Pro Ala Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gar atc tgt aca gaa ttg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Leu Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aay cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gct ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aaa aaa tca gta aca gta ctr 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Xaa
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gaa ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Glu Phe Arg
210 215 220

aag tac act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

rtt aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Xaa Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa att tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata att atc tat caa tac gtg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Ile Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gat ttg gaa ata gag cag cat aga aca aaa ata gag gaa cta 912
Gly Ser Asp Leu Glu Ile Glu Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga gaa cat ctg tgg aag tgg gga ttt tac aca cca gac aaa aaa cat 960
Arg Glu His Leu Trp Lys Trp Gly Phe Tyr Thr Pro Asp Lys Lys His
305 310 315 320

cag aag gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata aag ytg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Lys Xaa Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt caa 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat cca ggg 1116
Ile Tyr Pro Gly
370



<210> 66
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase
<400> 66

cct cag atc act ctt tgg caa cga ccc ctt gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gak rca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Xaa Xaa Val
20 25 30

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta agr car tat gac cag ata ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Xaa Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa atc tgt gga cag aaa gct ata ggt aca gta tta gta gga cct acm 240
Glu Ile Cys Gly Gln Lys Ala Ile Gly Thr Val Leu Val Gly Pro Xaa
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act caa att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aga gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Arg Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gca gaa att tgt aca gaa atg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Ala Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt aat ara tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Asn Xaa Trp Arg Lys Leu Val Asp Phe
165 170 175

agg gaa ctc aat aag aga act caa gac ttc tgg gaa gtt caa tta ggc 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aam aaa tca gta aca rta ctr 624
Ile Pro His Pro Ala Gly Leu Lys Lys Xaa Lys Ser Val Thr Xaa Xaa
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aar tat act gca ttt acc ata cct agt aca wac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Xaa Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag krc aat gtg yyt cca cag gga tgg aaa gga tcm cca 768
Ile Arg Tyr Gln Xaa Asn Val Xaa Pro Gln Gly Trp Lys Gly Xaa Pro
245 250 255

gca ata ttc mam agt agc ayg aca aaa att tta gag cct ttt aga aaa 816
Ala Ile Phe Xaa Ser Ser Xaa Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtt atc tgt caa tac gtg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Cys Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ttg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

agg caa cat ttg ttg agg tgg ggr ttt acc aca cca gac ara aaa cat 960
Arg Gln His Leu Leu Arg Trp Xaa Phe Thr Thr Pro Asp Xaa Lys His
305 310 315 320

cag aaa gag cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata aaa ctg cca gaa aaa gay agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Lys Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat gca ggg 1116
Ile Tyr Ala Gly
370

<210> 67
<211> 1119
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1119)
<223> Portion of HIV Reverse Transcriptase
<400> 67

cct caa atc act ctt tgg caa cga cca ata gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

cta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt aty aaa gta aga cag tat gat cag ata tcc ata 192
Gly Ile Gly Gly Phe Xaa Lys Val Arg Gln Tyr Asp Gln Ile Ser Ile
50 55 60

gaa atc tgt ggg cat aaa gtt aca ggt aca gtg tta ata gga cct aca 240
Glu Ile Cys Gly His Lys Val Thr Gly Thr Val Leu Ile Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag ctt ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca ttg gta gaa att tgt gca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Ala Glu Met Glu Lys Glu Gly
130 135 140

caa att tca aaa att gag cct gaa aat cca tac aat aat cca gta ttt 480
Gln Ile Ser Lys Ile Glu Pro Glu Asn Pro Tyr Asn Asn Pro Val Phe
145 150 155 160

gtc ata aag aaa aaa gac ggt act aac tgg aga aaa tta ata gat ytc 528
Val Ile Lys Lys Lys Asp Gly Thr Asn Trp Arg Lys Leu Ile Asp Xaa
165 170 175

aga gaa ctt aat aag aga act caa gat ttc tgg gaa att caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Ile Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aat aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Asn Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca ttt tat tca gtt ccc tta gat gag aac ttc agg 672
Asp Val Gly Asp Ala Phe Tyr Ser Val Pro Leu Asp Glu Asn Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca atg gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Met Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa att tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

aac aat cca gac ata gtc atc tat caa tac atg gat gat ttg tat gta 864
Asn Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gca tct gat tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Ala Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga gaa cat cta ttr aag tgg gga ttt acc aca cca gac aar aar yat 960
Arg Glu His Leu Xaa Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys Xaa
305 310 315 320

cag aaa gaa cct cca ytc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Xaa Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat cca ggg att 1119
Ile Tyr Pro Gly Ile
370

<210> 68
<211> 1119
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1119)
<223> Portion of HIV Reverse Transcriptase
<400> 68

cct caa atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

gga caa cta aaa gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca ggg aaa tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Lys Trp Lys Pro Lys Met Ile Gly
35 40 45

gga atc gga gga ttt atc aaa gta aga cag tat gag cag ata cac ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Glu Gln Ile His Ile
50 55 60

gaa atc tgt ggg cat aaa gct ata ggt aca gtr tta ata gga ccc aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Xaa Leu Ile Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag att ggc tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gag 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gtt ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aaa aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat cct gca ggg ttg aag aag aaa aaa tca gta aca gta cta 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa aac ttt agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asn Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt ata aat aat gaa aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa gct agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ala Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac atg rtt att tat caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Met Xaa Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

ggc tct gac tta gaa ata gga cag cat aga aca aaa ata gaa gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg ttg agg tgg ggg ttt acc aca cca gac aaa aaa cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctc tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aag gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa tta aat tgg gcg agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat cca ggg att 1119
Ile Tyr Pro Gly Ile
370



<210> 69
<211> 1119
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1119)
<223> Portion of HIV Reverse Transcriptase
<400> 69

cct cag atc act ctt tgg caa cga ccc cty gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Xaa Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa yta aag gaa gct mta tta gay aca gga gca gat gat aca gtg 96
Gly Gln Xaa Lys Glu Ala Xaa Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa ata ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Ile Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga gag tat gag cag ata caa gta 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Glu Tyr Glu Gln Ile Gln Val
50 55 60

gaa atc tgt gga cat aag gct ata rgt aca gta tta ata gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Xaa Thr Val Leu Ile Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat cta atg act cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Met Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gag act gta ccg gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggt cca aga gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Arg Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa ttg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Leu Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat acy ccr gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Xaa Xaa Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata ccg cat ccc gca ggg tta aag aag aaa aaa tca gta aca gta ctr 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Xaa
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tac act gca ttt acc ata cct agt ata aac aat gag aca cca gga 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240



att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gaa cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtt atc tat car tac atg gat gac ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa cta 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg tkg agg tgg gga ttt tac aca cca gac aaa aaa cat 960
Arg Gln His Leu Xaa Arg Trp Gly Phe Tyr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cac cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctr cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Xaa Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa tta aat tgg gcg agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat tca ggg att 1119
Ile Tyr Ser Gly Ile
370



<210> 70
<211> 1119
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1119)
<223> Portion of HIV Reverse Transcriptase
<400> 70

cct caa atc act ctt tgg caa cga ccc cty gtc kca ata aag gta ggr 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Xaa Val Xaa Ile Lys Val Xaa
1 5 10 15

ggg caa mta aag gaa gct yta tta gat aca gga gca gat gat aca gta 96
Gly Gln Xaa Lys Glu Ala Xaa Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aaa cag tat gat cag gta arc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Lys Gln Tyr Asp Gln Val Xaa Ile
50 55 60

gaa atc tgt gga cat aaa gct ata ggt aca gta tta ata gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Ile Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aay ctg ttg aca cag att ggt tgy act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca ara gty aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Xaa Xaa Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aar gca tta atg gaa att tgt gca gay atg gaa aag gaa ggr 432
Lys Ile Lys Ala Leu Met Glu Ile Cys Ala Asp Met Glu Lys Glu Xaa
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcy ata aag aaa aaa gac agc act aaa tgg aga aaa tta gta gat ttc 528
Xaa Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aaa act caa gac ttt tgg gaa gtc caa tta gga 576
Arg Glu Leu Asn Lys Lys Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccy gca ggg tta aaa aag aac aaa tca gta aca gta ttg 624
Ile Pro His Xaa Ala Gly Leu Lys Lys Asn Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccy tta gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Xaa Leu Asp Lys Asp Phe Arg
210 215 220

aaa tay act gca ttt acm ata cct agt ata aat aat gca aca cca ggg 720
Lys Tyr Thr Ala Phe Xaa Ile Pro Ser Ile Asn Asn Ala Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga rar 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Xaa
260 265 270

cag aat cca gac ata gtt atc tat caa tac atg gat gay ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa mta ggg cag cat aga rca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Xaa Gly Gln His Arg Xaa Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg tta agg tgg ggg ttt acc acw cca gac aag aaa cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Thr Xaa Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa ccc cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta car ccc ata gtg ttg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tay gsa ggg att 1119
Ile Tyr Xaa Gly Ile
370



<210> 71
<211> 1119
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1119)
<223> Portion of HIV Reverse Transcriptase
<400> 71

cct caa atc act ctt tgg caa cga ccc atc gtc tca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Ser Ile Lys Ile Gly
1 5 10 15

ggg gca aat aaa gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Ala Asn Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga aga tgg aag cca aaa atg ata gtg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Val
35 40 45

gga att gga ggt ttt agc aaa gta aga caa tat gat cag ata ccc ata 192
Gly Ile Gly Gly Phe Ser Lys Val Arg Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa atc tgc gga cgt aaa gtt gta ggt tca gta tta ata gga cct aca 240
Glu Ile Cys Gly Arg Lys Val Val Gly Ser Val Leu Ile Gly Pro Thr
65 70 75 80

cct gcc aac ata att gga aga aat ctg ttg act cag ctt ggc tgt act 288
Pro Ala Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca aaa gag 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Lys Glu
115 120 125

aaa ata aaa gca tta ata gaa att tgt aca gaa ttg gaa gaa gma gga 432
Lys Ile Lys Ala Leu Ile Glu Ile Cys Thr Glu Leu Glu Glu Xaa Gly
130 135 140

aaa att aca aaa att ggg cct gaa aat ccg tac aat act cca ata ttt 480
Lys Ile Thr Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Ile Phe
145 150 155 160

gcc ata aag aaa aar aac agt act aaa tgg aga aaa tta gta gac ttc 528
Ala Ile Lys Lys Lys Asn Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

agg gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat cct gca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca att ccc tta gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Ile Pro Leu Asp Lys Asp Phe Arg
210 215 220

aar tat act gca ttt acc ata cct agt acg aat aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tat aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat ccc gac ata gtt atc tat caa tac gtg gat gat ttg ctt gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Leu Val
275 280 285

gga tct gac tta gaa ata gag cag cat aga aca aaa ata gag gag cta 912
Gly Ser Asp Leu Glu Ile Glu Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg tgg aag tgg gga ttt tac aca cca gac aaa aaa cat 960
Arg Gln His Leu Trp Lys Trp Gly Phe Tyr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa ccc cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata caa aag tta gtg gga aaa tta aat tgg gcw agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Xaa Ser Gln
355 360 365

att tat cca ggg att 1119
Ile Tyr Pro Gly Ile
370



<210> 72
<211> 1119
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1119)
<223> Portion of HIV Reverse Transcriptase
<400> 72

cct cag atc act ctt tgg caa cga ccc cty gtc aca ata aag atc ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Xaa Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa tta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

ata gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Ile Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt rtc aaa gta aga caa tat gat cag gta ccc ata 192
Gly Ile Gly Gly Phe Xaa Lys Val Arg Gln Tyr Asp Gln Val Pro Ile
50 55 60

gaa att tgc gga cat aaa gct ata ggt aca gta tta ata gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Ile Gly Pro Thr
65 70 75 80

cct gyc aac ata att gga aga aac ctg ttg act caa ctt ggc tgc act 288
Pro Xaa Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt cca att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125



aaa ata aaa gca tta gta gaa att tgt aca gaa ctg gaa aaa gga agg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Leu Glu Lys Gly Arg
130 135 140

aaa aat tac aaa att ggg cct gaa aac cca tac aat act cca gta ttt 480
Lys Asn Tyr Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat cct gca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttc tca gtt ccc tta gat aag gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agc ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg etc cca cag gga tgg aaa gga tea cca 768 
lie Arg Tyr Gin Tyr Asn Val Leu Pro Gin Gly Trp Lys Gly Ser Pro 
245 250 255 

gem ata ttc caa tgt age atg aca aaa ate tta gag cct ttt aga aaa 816 
Xaa lie Phe Gin Cys Ser Met Thr Lys lie Leu Glu Pro Phe Arg Lys 
260 265 270 

caa aat cca gaa ata gtt ate tat caa tac atg gat gat ttg tat gta 864 
Gin Asn Pro Glu lie Val lie Tyr Gin Tyr Met Asp Asp Leu Tyr Val 
275 280 285 

ggg tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912 
Gly Ser Asp Leu Glu lie Gly Gin His Arg Thr Lys lie Glu Glu Leu 
290 295 300 

aga cga cat ctg ttg aag tgg gga ttt tac aca cca gac aaa aaa cat 960 
Arg Arg His Leu Leu Lys Trp Gly Phe Tyr Thr Pro Asp Lys Lys His 
305 310 315 320 

cag aaa gaa ccc cca ttc ctt tgg atg ggt tat gag ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta caa cct ata gtg cta cca gag aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aag tta aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

ata tac gca ggg att 1119
Ile Tyr Ala Gly Ile
370

<210> 73
<211> 1119
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1119)
<223> Portion of HIV Reverse Transcriptase

<400> 73

cct caa atc act ctt tgg caa cga ccc ttc gtc aca gta aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Phe Val Thr Val Lys Ile Gly
1 5 10 15

ggg cag cta aag gaa gct cta tta gat aca gga gca gat aat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asn Thr Val
20 25 30

tta gaa gaa atg aat tta ccg gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag rta ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Xaa Pro Ile
50 55 60

gaa atc tgt gga cac aaa gct ata ggt aca gta tta ata gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Ile Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga gat ctg ttg act cag ctt ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asp Leu Leu Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gat act gta cca gta aaa tta aaa 336
Leu Asn Phe Pro Ile Ser Pro Ile Asp Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa ctg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Leu Glu Lys Glu Gly
130 135 140

aag att tca aaa att ggg cct gaa aat cca tac aat acc cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gct ata aag aaa aaa gac agt act aaa tgg aga aag tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gcg ggg tta aaa aag aac aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Asn Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt ccc cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa att tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

cag aat cca gac ata gtt atc tac caa tac gtg gat gac ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga gca aaa ata gat gag ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Ala Lys Ile Asp Glu Leu
290 295 300

agg caa cat ctg ttg aag tgg gga ttt tac aca cca gac aaa aag cat 960
Arg Gln His Leu Leu Lys Trp Gly Phe Tyr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cca cca ttc ctt tgg atg ggk tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Xaa Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aar gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tac cca ggg att 1119
Ile Tyr Pro Gly Ile

370

<210> 74
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 74

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata aag gtc ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Val Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gag gaa cta aat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Leu Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aaa cag tat gat cag ata ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Lys Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa ata tgt gga cat aaa gct att ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aac ttg ttg act cag ctt ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta aca gaa att tgt aca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Thr Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gct ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggt tta aaa aag aaa aaa tca gta aca gtc ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gaa ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Glu Phe Arg
210 215 220

aag tac act gca ttt acc ata cct agt ata aac aat gag aca cca gga 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tac cag tac aat gtg ctt ccc cag ggg tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt agg aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtt atc tac caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga cag cat ctg ttg agg tgg gga ttt acc aca cca gac aaa aag cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gag ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aag gat agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa tta aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat gca ggg 1116
Ile Tyr Ala Gly
370

<210> 75
<211> 819
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(819)
<223> Portion of HIV Reverse Transcriptase

<400> 75

ccc att agt cct att gam act gta cca gta aaa tta aag cca gga atg 48
Pro Ile Ser Pro Ile Xaa Thr Val Pro Val Lys Leu Lys Pro Gly Met
1 5 10 15

gat ggc cca aaa gtt aaa caa tgg cca tta aca gag gaa aaa ata aaa 96
Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu Lys Ile Lys
20 25 30

gca ttg gta gaa att tgt aca gaa atg gaa aag gaa gga aaa att tca 144
Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly Lys Ile Ser
35 40 45

aaa att ggg cct gaa aat cca tac aat act cca gta ttt gcc ata aag 192
Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe Ala Ile Lys
50 55 60

aaa aag gac agt act aaa tgg aga aaa tta gta gat ttc aga gaa ctt 240
Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe Arg Glu Leu
65 70 75 80

aat aar aga act caa gat ttc tgg gaa gtt caa tta gga ata cca cat 288
Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly Ile Pro His
85 90 95

ccc tca ggg tta aaa aag aay aaa tca gta aca gta ttg gat gtg ggt 336
Pro Ser Gly Leu Lys Lys Asn Lys Ser Val Thr Val Leu Asp Val Gly
100 105 110

gat gca tat ttt tca gtt ccy tta gat aaa gac ttc agg aag tat act 384
Asp Ala Tyr Phe Ser Val Xaa Leu Asp Lys Asp Phe Arg Lys Tyr Thr
115 120 125

gca ttt acc ata cct agt ata aac aat gag aca cca ggg att agr tat 432
Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly Ile Xaa Tyr
130 135 140

cag tac aat gtg ctt cca caa gga tgg aaa gga tca cca gca ata ttc 480
Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro Ala Ile Phe
145 150 155 160

caa agt agc atg aca aaa atc tta gag cct ttt aga aaa cat aat cca 528
Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys His Asn Pro
165 170 175

gac ata gtt atc tat caa tac gtg gat gat ttg tat gta gga tct gac 576
Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val Gly Ser Asp
180 185 190

tta gaa ata gag gag cat aga aca aaa ata gag gaa ctg agr vrg cat 624
Leu Glu Ile Glu Glu His Arg Thr Lys Ile Glu Glu Leu Xaa Xaa His
195 200 205

ctg tta aag tgg gga ttt acy aca cca gac aaa aag cat cag aaa gaa 672
Leu Leu Lys Trp Gly Phe Xaa Thr Pro Asp Lys Lys His Gln Lys Glu
210 215 220

cct cca ttt ctt tgg atg ggt tat gaa ctc cat cct gat aaa tgg aca 720
Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp Lys Trp Thr
225 230 235 240

gta cag cct ata aag ctg cca gaa aaa gac agc tgg act gtc aat gac 768
Val Gln Pro Ile Lys Leu Pro Glu Lys Asp Ser Trp Thr Val Asn Asp
245 250 255

ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag att tat gca 816
Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln Ile Tyr Ala
260 265 270

ggg 819 
Gly

<210> 76
<211> 819
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(819)
<223> Portion of HIV Reverse Transcriptase

<400> 76

ccc att agt cct att gaa act gta cca gta aaa tta aag cca gga atg 48
Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys Pro Gly Met
1 5 10 15

gat ggc cca aaa gty aaa caa tgg cca tta aca gaa gaa aaa ata aga 96
Asp Gly Pro Lys Xaa Lys Gln Trp Pro Leu Thr Glu Glu Lys Ile Arg
20 25 30

gca tta gta gaa att tgt aca gaa atg gaa aag gaa gga aaa att tca 144
Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly Lys Ile Ser
35 40 45

aaa att ggg cct gaa aat cca tac aat act cca gtg ttt gct ata aag 192
Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe Ala Ile Lys
50 55 60

aaa aaa gac agt act aar tgg aga aaa ttg gta gat ttc aga gaa ctt 240
Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe Arg Glu Leu
65 70 75 80

aat aag aga act caa gac ttc tgg gaa gtt caa tta gga ata cca cat 288
Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly Ile Pro His
85 90 95

ccc tca ggg tta aaa aag aaa aaa tca gta aca gta ctg gat gtg ggt 336
Pro Ser Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu Asp Val Gly
100 105 110

gat gca tat ttt tca gtt ccc tta gat aaa gac ttc agg aag tat act 384
Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg Lys Tyr Thr
115 120 125

gca ttt act atn cct agt ata aac aat gag aca cca ggg att agg tat 432
Ala Phe Thr Xaa Pro Ser Ile Asn Asn Glu Thr Pro Gly Ile Arg Tyr
130 135 140

cag tac aat gtg ctt cca caa gga tgg aaa gga tca cca gca ata ttc 480
Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro Ala Ile Phe
145 150 155 160

caa agt agc atg aca aaa atc tta gag cct ttt aga aaa caa aat cca 528
Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys Gln Asn Pro
165 170 175






gac ata gtt atc tat caa tac gtg gat gat ttg tat gta gga tct gac 576
Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val Gly Ser Asp
180 185 190

cta gaa ata gga cag cat aga aca aaa ata gag gaa ctg aga cag cat 624
Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu Arg Gln His
195 200 205

ctg ttg agg tgg gga ttt acc aca cca gac aag aaa cat cag aaa gaa 672
Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His Gln Lys Glu
210 215 220

cct ccc ttt ctt tgg atg ggc tat gaa ctc cat cct gat aaa tgg aca 720
Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp Lys Trp Thr
225 230 235 240

gta cag cct ata gag ctg cca gac aag gat agc tgg act gtc aat gac 768
Val Gln Pro Ile Glu Leu Pro Asp Lys Asp Ser Trp Thr Val Asn Asp
245 250 255

ata cag aag tta gtg gga aaa tta aat tgg gca agt cag ata tat gca 816
Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln Ile Tyr Ala
260 265 270

ggg 819
Gly

<210> 77
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 77

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gac atg aat ttg cca ggg aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Asp Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag ata cct ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa atc tgc gga cat aaa gct gta ggt aaa gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Val Gly Lys Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act caa ctt ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt cct att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aag caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gag aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gct ata aag aaa aaa aac agt act aga tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Ser Thr Arg Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga acg caa gac ttc tgg gaa gtt caa nnn nnn 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Xaa Xaa
180 185 190

nnn nnn nnn nnn nnn ggg twa aaa aag aaa aaa tca gta aca gta ctg 624
Xaa Xaa Xaa Xaa Xaa Gly Xaa Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gta ggt gat gca tat ttc tca gtt cct cta gat aaa gac ttc aga 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tac act gca ttc acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctg cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtt atc tat caa tac gtg gat gat ttg tat gtg 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ttg ttg aag tgg gga ttt acc aca cca gac aaa aaa cat 960
Arg Gln His Leu Leu Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aag gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agc cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

ata tat gca ggg 1116
Ile Tyr Ala Gly
370

<210> 78
<211> 1122
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1122)
<223> Portion of HIV Reverse Transcriptase

<400> 78

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gac atg gat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Asp Met Asp Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aaa cag tat gat cag ata ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Lys Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa atc tgt gga cat aaa gtt ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Val Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag ctt ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aga gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Arg Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca ttg gta gaa ata tgt aca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat acr cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Xaa Pro Val Phe
145 150 155 160

gcc ata arg aaa aaa gaa agc tct agc tct aaa tgg aga aaa tta gta 528
Ala Ile Xaa Lys Lys Glu Ser Ser Ser Ser Lys Trp Arg Lys Leu Val
165 170 175

gat ttc aga gaa ctt aat aar aga act caa gac ttt ttk gaa gtt caa 576
Asp Phe Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Xaa Glu Val Gln
180 185 190

tta gga ata cca cat ccc gca ggg tta aag aag aaa aaa tca gya aca 624
Leu Gly Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Xaa Thr
195 200 205

rta ttg gat gtg ggt gat gca tat ttt tca gtt ccc tta gat raa gac 672
Xaa Leu Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Xaa Asp
210 215 220

ttc agg aag tat act gca ttt acc ata cct agt ata aac aat gag aca 720
Phe Arg Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr
225 230 235 240

cca ggg att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga 768
Pro Gly Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly
245 250 255

tca cca gct ata ttc caa agt agc atg aca aaa atc tta gag cct ttt 816
Ser Pro Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe
260 265 270

aga aaa caa aat cca gay ata gtt atc tat caa tac atg gat gat ttg 864
Arg Lys Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu
275 280 285

tat gta gga tct gay tta gaa ata gag cag cat aga ata aaa ata gag 912
Tyr Val Gly Ser Asp Leu Glu Ile Glu Gln His Arg Ile Lys Ile Glu
290 295 300

gaa ctg aga caa yat ytg tgg arg tgg ggr ttt tac aca cca gac aaa 960
Glu Leu Arg Gln Xaa Xaa Trp Xaa Trp Xaa Phe Tyr Thr Pro Asp Lys
305 310 315 320

aaa cat cag aaa gaa cct cca ttc cat tgg atg ggt tat gaa ctc cat 1008
Lys His Gln Lys Glu Pro Pro Phe His Trp Met Gly Tyr Glu Leu His
325 330 335

cct gat aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc 1056
Pro Asp Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser
340 345 350

tgg act gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca 1104
Trp Thr Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala
355 360 365

agt cag att tat gca ggr 1122
Ser Gln Ile Tyr Ala Xaa
370

<210> 79
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 79

cct cag atc act ctt tgg caa cga ccc ctc gtt aca ata aag gta ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Val Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gac aat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asn Thr Val
20 25 30

ttc gaa gac ctg gat tta cca gga agg tgg aaa cca aaa atg ata ggg 144
Phe Glu Asp Leu Asp Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aaa cag tat gag cag ata ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Lys Gln Tyr Glu Gln Ile Pro Ile
50 55 60

gaa atc tgt ggg cgt aaa gct ata ggt aca gtg tta gta gga cct aca 240
Glu Ile Cys Gly Arg Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga gat ctg ttg act cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asp Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

cta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aga gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Arg Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta ata gaa att tgt gca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Ile Glu Ile Cys Ala Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aag aac agt aat aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Ser Asn Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aaa aag tca ata aca gta tta 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Ile Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttc tca gtt ccc tta gat gaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt aca aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctg cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa att tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtt atc tat caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gat tta gaa ata gag cag cat aga aca aaa ata gat gaa ctg 912
Gly Ser Asp Leu Glu Ile Glu Gln His Arg Thr Lys Ile Asp Glu Leu
290 295 300

aga caa cat ctg ttg agg tgg gga ctt acc aca cca gac cag aaa cat 960
Arg Gln His Leu Leu Arg Trp Gly Leu Thr Thr Pro Asp Gln Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gac aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Asp Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg ggr aaa ttg aat tgg gca agt caa 1104
Val Asn Asp Ile Gln Lys Leu Val Xaa Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tac cca ggg 1116
Ile Tyr Pro Gly
370

<210> 80
<211> 1119
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1119)
<223> Portion of HIV Reverse Transcriptase

<400> 80

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg cag cta aag gag gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag ata ctc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Leu Ile
50 55 60

gaa att tgt gga cat aaa gct ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg acw cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Xaa Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aga gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Arg Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aar gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca rta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Xaa Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag agg act caa gat ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg ttg aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttc tca gtt ccc tta gat gaa gac ttc aga 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtc atc tat caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat agg aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ttg ttg aag tgg ggg ttt acc aca cca gac aaa aaa cat 960
Arg Gln His Leu Leu Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gtg cag cct ata gtg tta ccg gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tac cca ggg att 1119
Ile Tyr Pro Gly Ile
370

<210> 81
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 81

cct caa atc act ctt tgg caa cga ccy ctt gtt rcc ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Xaa Leu Val Xaa Ile Lys Ile Gly
1 5 10 15

ggg caa cta arg gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Xaa Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa ata aat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Ile Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aaa cag tat gat caa ata ccy rta 192
Gly Ile Gly Gly Phe Ile Lys Val Lys Gln Tyr Asp Gln Ile Xaa Xaa
50 55 60

gaa att tgt gga cat aga gct ata ggt aca gtw tta gta gga cct aca 240
Glu Ile Cys Gly His Arg Ala Ile Gly Thr Xaa Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga agr aat ctg ttg act cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Xaa Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca ttg gta gaa att tgt aca gaa atg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aga att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Arg Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gct ata aag aaa aar gat agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

agg gaa ctt aat aag agg act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gaa ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Glu Phe Arg
210 215 220

aag tat act gca ttt act ata cct agt aca aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat caa tac aat gtg ctt cca caa gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cca ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gaa ata gtc atc tat caa tac gtg gat gat ttg tat gta 864
Gln Asn Pro Glu Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata gga cag cat aga aca aaa ata gag gaa yta 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Xaa
290 295 300

aga gaa cat ctg tta arg tgg gga ttt acc aca cca gac aaa aag cat 960
Arg Glu His Leu Leu Xaa Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttt ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata cag ctg cca gaa aag gaa agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Gln Leu Pro Glu Lys Glu Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat gca ggg 1116
Ile Tyr Ala Gly
370 

<210> 82 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 



<400> 82 

cct cag atc act ctt tgg caa cga ccc ctc gtc aca gta aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Val Lys Ile Gly
1 5 10 15

ggg caa tta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg agt tta cca gga aaa tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Ser Leu Pro Gly Lys Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag ata ctt gta 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Leu Val
50 55 60

gaa atc tgt gga cat aaa gct ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

ccc gtc aac ata att gga aga aat ctg ttg act cag att ggg tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aaa aag aaa gac agt act aaa tgg aga aag tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aay aaa aag act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Lys Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gam ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Xaa Phe Arg
210 215 220

aar tat act gca ttt acc ata cct agt gta aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Val Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttt caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtt atc tay cag tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggr aag cac aga aca aaa ata gag gag cta 912
Gly Ser Asp Leu Glu Ile Xaa Lys His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga cag cat ctg ttg aag tgg gga ttt acc aca cca gac aaa aaa cat 960
Arg Gln His Leu Leu Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctk tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Xaa Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata aaa ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Lys Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gty aat gac ata cag aag tta gtg gga aaa ttr aat tgg gcc agt cag 1104
Xaa Asn Asp Ile Gln Lys Leu Val Gly Lys Xaa Asn Trp Ala Ser Gln
355 360 365

att tat gca ggg 1116
Ile Tyr Ala Gly
370

<210> 83 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 
<400> 83 

cct cag atc act ctt tgg caa cga cca ctc gtc gca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Ala Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gac atg agt tta cca gga aaa tgg aaa cca aaa atg ata ggg 144
Leu Glu Asp Met Ser Leu Pro Gly Lys Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat caa gta ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Val Pro Ile
50 55 60

gaa atc tgt gga cat aaa gct ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag ctt ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tat aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg agg aaa tta gta gat ttt 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gat ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat cca gca ggg tta aaa aag aaa aag tca gta aca gtg ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gaa ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Glu Phe Arg
210 215 220

aag tat act gca ttt acc ata ccc agt ata aac aat gag aca ccc agg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Arg
225 230 235 240

gtt aga tat caa tac aat gta ctt cca cag gga tgg aaa gga tca cca 768
Val Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca tat ttc caa agt agc atg aca aaa atc tta gaa ccc ttc aga aaa 816
Ala Tyr Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aac cca gac ata gtt atc tat caa tac atg gat gac tta tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gag ata gga cag cat aga gca aaa ata gag gac cta 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Ala Lys Ile Glu Asp Leu
290 295 300

aga gca cat ctg ttg aag tgg ggg ttt acc aca cca gac aaa aaa cat 960
Arg Ala His Leu Leu Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa ccc cca ttt ctc tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gwg cta cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Xaa Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aaa tta gta gga aaa tta aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat cca ggg 1116
Ile Tyr Pro Gly
370

<210> 84 

<211> 1116 

<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 
<400> 84 

cct caa atc act ctt tgg caa cga ccc att gtc aca ata aaa gta ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Thr Ile Lys Val Gly
1 5 10 15

ggg caa cta atg gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Met Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gac ata aat ttg cca gga aga tgg aaa cca aaa ata ata ggg 144
Leu Glu Asp Ile Asn Leu Pro Gly Arg Trp Lys Pro Lys Ile Ile Gly
35 40 45

gga att ggt ggt ttt gtc aaa gtg aga cag tat gat cag gta ccc ata 192
Gly Ile Gly Gly Phe Val Lys Val Arg Gln Tyr Asp Gln Val Pro Ile
50 55 60

gaa atc tgt gga cat aaa gtt ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Val Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct acc aac gta gtt gga aga aat ctg atg act cag att ggc tgc acy 288
Pro Thr Asn Val Val Gly Arg Asn Leu Met Thr Gln Ile Gly Cys Xaa
85 90 95

tta aat ttt cct att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg acg gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa ctg gaa aag gat gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Leu Glu Lys Asp Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tat aat act cca ata ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Ile Phe
145 150 155 160

gcc ata aag aaa aag aac agt gat aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Ser Asp Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aar aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat cct gca ggg tta aaa aag aat aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Asn Lys Ser Val Thr Val Leu
195 200 205

gat ata ggt gat gca tat ttt tca att ccc tta gat aaa gac ttt agg 672
Asp Ile Gly Asp Ala Tyr Phe Ser Ile Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tat act gca ttc acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

gtt aga tat cag tac aat gtg ctt cca cag gga tgg aag gga tca cca 768
Val Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agc agc atg acc aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

cag aat cca gac ata gtt atc tgc caa tac gtg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Cys Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctr 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Xaa
290 295 300

agg aat yat ctg tgg aag tgg gga ttt tac aca cca gac aaa aaa tat 960
Arg Asn Xaa Leu Trp Lys Trp Gly Phe Tyr Thr Pro Asp Lys Lys Tyr
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag ccc ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa tta aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat cca ggg 1116
Ile Tyr Pro Gly
370

<210> 85 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 



<400> 85 

cct caa atc act ctt tgg caa cga ccc ctc gtc aca ata aaa gta ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Val Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca ggg aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag gta agc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Val Ser Ile
50 55 60

gaa atc tgt gga cat aaa gct ata ggt aca gta tta ata gga ccc acc 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Ile Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag ctt ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aga gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Arg Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gag aag gaa ggr 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Xaa
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aar aaa aaa gac agt act aaa tgg aga aag tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aaa ara act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Xaa Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aam aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Xaa Lys Ser Val Thr Val Leu
195 200 205

gay gtg ggt gat gcr tat ttt tca gtt ccy tta gay aaa gay ttc agg 672
Asp Val Gly Asp Xaa Tyr Phe Ser Val Xaa Leu Asp Lys Asp Phe Arg
210 215 220

aag tac aca gca ttt acc ata cct agt gta aac aat gag rca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Val Asn Asn Glu Xaa Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca car gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aar 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

maa aat cca gac ata gty atc tay caa tac atg gat gat ttr tat gta 864
Xaa Asn Pro Asp Ile Xaa Ile Tyr Gln Tyr Met Asp Asp Xaa Tyr Val
275 280 285

gga tct gac tta gaa ata gga cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg ttg cag tgg ggg tta acc aca cca gac aaa aaa cat 960
Arg Gln His Leu Leu Gln Trp Gly Leu Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggg tat gaa ctc cat ccg gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata wtg ctg cca gac aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Xaa Leu Pro Asp Lys Asp Ser Trp Thr
340 345 350

gtm aat gac ata cag aar tta gta gga aaa ttg aat tgg gcg agt cag 1104
Xaa Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

atc tac cca ggg 1116
Ile Tyr Pro Gly
370



<210> 86 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 



<400> 86 

cct caa atc act ctt tgg caa cga ccc atc gtc aca gta aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Thr Val Lys Ile Gly
1 5 10 15

ggg cac aca acg gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly His Thr Thr Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca ggg aga tgg aaa cca aaa atg ata gga 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gag cag gta ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Glu Gln Val Pro Ile
50 55 60

gaa ttc tgt gga cat aaa act gta ggt aca gta tta ata gga cct aca 240
Glu Phe Cys Gly His Lys Thr Val Gly Thr Val Leu Ile Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg atg act cag att ggt tgt act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Met Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggg ccc aaa gtt aaa cca tgg cca ttg aca gaa aga 384
Pro Gly Met Asp Gly Pro Lys Val Lys Pro Trp Pro Leu Thr Glu Arg
115 120 125

aaa aat aaa gca tta gta gaa att tgt tcc gaa atg gaa aaa gga agg 432
Lys Asn Lys Ala Leu Val Glu Ile Cys Ser Glu Met Glu Lys Gly Arg
130 135 140

aaa att tca aaa att ggg cct gag aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aag aac agt act aga tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Ser Thr Arg Trp Arg Lys Leu Val Asp Phe
165 170 175



aga gaa ctt aat aaa aga act caa gac ttc tgg gaa gtt cag tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aac aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Asn Lys Ser Val Thr Val Leu
195 200 205

gat gta ggt gat gca tat ttt tca gtt ccc tta gat gaa gaa ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Glu Phe Arg
210 215 220

aag tat act gca ttc acc ata cct agt aca aac aat gaa aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa tgt agc atg aca aaa atc tta gag ccc ttt aga aaa 816
Ala Ile Phe Gln Cys Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gaa ata gtt atc tgt cag tac atg gat gac ttg tat gta 864
Gln Asn Pro Glu Ile Val Ile Cys Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gca tct gat tta gaa ata ggg cag cat aga aca aaa gta gag gaa ctg 912
Ala Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Val Glu Glu Leu
290 295 300

aga caa cat ctg ttg aag tgg ggg ttt ttc aca cca gac gaa aaa cat 960
Arg Gln His Leu Leu Lys Trp Gly Phe Phe Thr Pro Asp Glu Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gta ctg cca gac caa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Asp Gln Asp Ser Trp Thr
340 345 350

gtc aat gat ata cag aag tta gtg gga aaa tta aat tgg gca agt caa 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tac cca ggg 1116
Ile Tyr Pro Gly
370
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<223> 
<400> 87

cct cag atc act ctt tgg caa cga ccc ctc gtc 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Glu
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg tca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Ser Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag ata ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gag atc tgt gga cat aaa gct gta ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Val Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga agr aat ctg ttg act cag att ggt tgc acc 288
Pro Val Asn Ile Ile Gly Xaa Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca ata ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Ile Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat tty 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175



aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccy gca ggg ttg aar aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Xaa Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttc tca gtt ccc tta gat gaa gay ttc aga 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

gtt aga tat car tac aat gtg ctt cca cag gga tgg aag gga tca cca 768
Val Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agc agc atg aca aaa atc tta gag cct ttt agg aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gat ata gtt atc tat caa tac atg gat gac ttr tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Xaa Tyr Val
275 280 285

gga tct gac tta gaa ata ggg car cat aga aca aaa ata gag gaa ttg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg ttg aag tgg gga tta acc aca cca gac aaa aaa cat 960
Arg Gln His Leu Leu Lys Trp Gly Leu Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gat ata cag aag tta gtg gga aaa tta aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat gca ggg 1116
Ile Tyr Ala Gly
370



<210> 88 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 
<400> 88 

cct cag atc act ctt tgg caa cga ccc atc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta agg raa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Arg Xaa Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gac ata gaa ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Asp Ile Glu Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt gtc aaa gta aga caa tat gat cag ata ccc ata 192
Gly Ile Gly Gly Phe Val Lys Val Arg Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa atc tgt gga cat aaa gtt ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Val Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gcc aac ata att gga aga aat ctg atg act cag ctt ggt tgc act 288
Pro Ala Asn Ile Ile Gly Arg Asn Leu Met Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca aaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Lys Glu
115 120 125

aaa ata gaa gca tta atr gaa att tgt gma ttt ttg gaa aag gaa gga 432
Lys Ile Glu Ala Leu Xaa Glu Ile Cys Xaa Phe Leu Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat ccg tac aac act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gga ggt act aaa tgg aga aaa ata gta gat ttc 528
Ala Ile Lys Lys Lys Gly Gly Thr Lys Trp Arg Lys Ile Val Asp Phe
165 170 175

aga gaa ctt aat aaa aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gcg ggg tta aaa aag aay aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Asn Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca att ccc tta gat gaa gaa ctc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Ile Pro Leu Asp Glu Glu Leu Arg
210 215 220

aag tat act gca ttt act ata cct agt aca aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tac caa tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttt caa agt agc atg aca aaa atc tta gag ccc ttt aga aag 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtt atc twt caw tac gtg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Xaa Xaa Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg aag cat agg gaa aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Lys His Arg Glu Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg tgg aag tgg gga ttt tac aca cca gac gaa aaa cat 960
Arg Gln His Leu Trp Lys Trp Gly Phe Tyr Thr Pro Asp Glu Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat ctt gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Leu Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa tta aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat gca ggg 1116
Ile Tyr Ala Gly
370

<210> 89 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 
<400> 89 

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg agt ttg cca ggg aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Ser Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga caa ttt gat cag ata ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Phe Asp Gln Ile Pro Ile
50 55 60

gaa ata tgt gga cac aaa gct ata ggt aca gta tta ata gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Ile Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga agg aat ctg ttg act cag ctt ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt ccc atc agt cct att gaa cct gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Pro Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125



aaa ata aaa gca tta gta gaa att tgt aca gaa ctg gaa aaa gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Leu Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca ata ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Ile Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctg aat aag aaa act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Lys Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aaa aaa tca gta acg gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aaa tat act gca ttt acc ata cct agt aca aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttt caa cat agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln His Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

cag aat cca gac ata gtt atc tat caa tac gtg gat gac ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga gaa cat ctg ttg aag tgg gga ttt tac aca cca gac aaa aaa cat 960
Arg Glu His Leu Leu Lys Trp Gly Phe Tyr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gat ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat gca ggg 1116
Ile Tyr Ala Gly
370



<210> 90 
<211> 1116 
<212> DNA 

<213> Human Immunodeficiency Virus (HIV)

<220> 

<221> CDS 

<222> (0) . . . (297) 

<223> HIV Protease 

<221> CDS 

<222> (298) . . . (1116) 

<223> Portion of HIV Reverse Transcriptase 
<400> 90 

cct cag atc act ctt tgg caa cga ccc aty gtc aca ata aaa gta ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Xaa Val Thr Ile Lys Val Gly
1 5 10 15

gga cag cta aag gaa gct yta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Xaa Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aac ttg cca gga aaa tgg aaa cca aaa ata ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Lys Trp Lys Pro Lys Ile Ile Gly
35 40 45

gga att gga ggt ttt gtc aga gta aga caa tat gat cag gta cct gta 192
Gly Ile Gly Gly Phe Val Arg Val Arg Gln Tyr Asp Gln Val Pro Val
50 55 60

gaa att tgt gga cat aaa gct ata ggt tca gta tta gta gga cca aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Ser Val Leu Val Gly Pro Thr
65 70 75 80

cct gcc aac ata att gga aga aat ctg atg act cag ctt ggt ttc act 288
Pro Ala Asn Ile Ile Gly Arg Asn Leu Met Thr Gln Leu Gly Phe Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gar att tgt aca gaa ytg gaa aaa gaa gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Xaa Glu Lys Glu Gly
130 135 140

aag att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aag aac agt gat aga tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Ser Asp Arg Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat cct gga ggg tta aaa aag aaa aaa tea gta aca gta eta 624 
lie Pro His Pro Gly Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu 
195 200 205 



gat gtg ggt gat gca tat ttc tea gtt ccc tta gat gaa gac ttc agg 
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arq 

OTA o-ir- 



210 215 220 



672 



aag tat act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 72 0 

Lys Tyr Thr Ala Phe Thr He Pro Ser He Asn Asn Glu Thr Pro Gly 
225 230 235 240 

att aga tat car tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata tty caa agt agc atg aca aaa atc tta gag cct ttt agg aag 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

maa aat cca gac ata gtt atc att caa tac atg gat gat ttg tat gtr 864
Xaa Asn Pro Asp Ile Val Ile Ile Gln Tyr Met Asp Asp Leu Tyr Xaa
275 280 285

gga tct gat tta gaa ata gar cag cay aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Glu Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga gat cat tta ttg agg tgg ggg ttt ttc aca cca gaa caa aaa cat 960
Arg Asp His Leu Leu Arg Trp Gly Phe Phe Thr Pro Glu Gln Lys His
305 310 315 320

cag aaa gaa cct cca ttc cat tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe His Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cat cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val His Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttr aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Xaa Asn Trp Ala Ser Gln
355 360 365

att tat gca ggg 1116
Ile Tyr Ala Gly
370



<210> 91
<211> 1115
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1115)
<223> Portion of HIV Reverse Transcriptase
<400> 91

cct cag atc act ctt tgg caa cga ccc ctt gtc aca gta aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Val Lys Ile Gly
1 5 10 15

ggg caa cta ata gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Ile Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

ttg gaa gaa atg aat ttg cca ggg aga tgg aaa cca aaa ata ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Ile Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag ata ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa atc tgt gga cat aaa gtt ata rgt cca gta tta ata gga cct aca 240
Glu Ile Cys Gly His Lys Val Ile Xaa Pro Val Leu Ile Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ttg atg act cag att ggc tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Met Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc atc agt cct att raa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Xaa Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aag gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa atc tca aaa att ggg cct gaa aac cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa aac agt act aga tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Ser Thr Arg Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat cct gga ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Gly Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt cct cta gat gaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt ata aat aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

gtt aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tcg cca 768
Val Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttt cag gct agc atg aca aaa atc tta gag ccg ttt aga aaa 816
Ala Ile Phe Gln Ala Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtt atc tat caa tac gtg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac cta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ttg ttg aaa tgg gga ttt atc aca cca gat gaa aaa cat 960
Arg Gln His Leu Leu Lys Trp Gly Phe Ile Thr Pro Asp Glu Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggg tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aag tgg aca gta cag cct ata gta ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aaa tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat gca gg 1115
Ile Tyr Ala
370



<210> 92
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase
<400> 92

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg cag cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gac ata aac ttg cca gga aaa tgg aaa cca aaa atg ata ggg 144
Leu Glu Asp Ile Asn Leu Pro Gly Lys Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gag cag gta ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Glu Gln Val Pro Ile
50 55 60

gaa atc tgt gga cat aaa act ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Thr Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg atg act cag att ggg tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Met Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aag aac agt act aga tgg aga aaa gta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Ser Thr Arg Trp Arg Lys Val Val Asp Phe
165 170 175

aga gaa ctt aat aag aaa act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Lys Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aac aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Asn Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat gaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt ata aac aat gag acg cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa ata tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ctg gtt atc tgt caa tac atg gat gat tta tat gta 864
Gln Asn Pro Asp Leu Val Ile Cys Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac cta gaa ata ggg cag cat aga aca aaa ata gaa gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

agg caa cat ctg ttg aag tgg gga ttt acc aca cca gac gaa aaa cat 960
Arg Gln His Leu Leu Lys Trp Gly Phe Thr Thr Pro Asp Glu Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag ccc ata gtg ctg cca gac aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Asp Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat gca ggg 1116
Ile Tyr Ala Gly
370



<210> 93
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase
<400> 93

cct cag atc act ctt tgg caa cga ccc atc gtc aca ata aag ata gga 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Thr Ile Lys Ile Gly
1 5 10 15

ggg cag cta aag gaa gct cta ata gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Ile Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat tta cca gga aga tgg aca cca aaa ata ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Thr Pro Lys Ile Ile Gly
35 40 45

gga att gga ggt ttt gtc aga gta aga cag tat gaa cag ata ccc gta 192
Gly Ile Gly Gly Phe Val Arg Val Arg Gln Tyr Glu Gln Ile Pro Val
50 55 60

gaa atc tgc ggg cat aaa gct gta ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Val Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gcc aac ata att gga aga aat ctg ttg act cag att ggc tgt act 288
Pro Ala Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gat act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Asp Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca ara gtt aaa caa tgg cca ttg aca gaa gag 384
Pro Gly Met Asp Gly Pro Xaa Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa ctg gaa aag gam gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Leu Glu Lys Xaa Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gct ata aag aaa aaa gac agt act aaa tgg aga aaa gta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Val Val Asp Phe
165 170 175

aga gaa ctt aat aaa aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat cct gca ggg ata maa aag aac aaa tca gta aca gta ytg 624
Ile Pro His Pro Ala Gly Ile Xaa Lys Asn Lys Ser Val Thr Val Xaa
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat gag gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg
210 215 220

aag tac act gca ttt acc ata cct agt aca aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tac aat gta ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa aty tta gag cct ttt aga aag 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Xaa Leu Glu Pro Phe Arg Lys
260 265 270

aaa aat cca gac ata rtt atc tgc caa tac atg gat gat ttg tat gta 864
Lys Asn Pro Asp Ile Xaa Ile Cys Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata gag cag cat aga aca aaa ata gat gaa ctg 912
Gly Ser Asp Leu Glu Ile Glu Gln His Arg Thr Lys Ile Asp Glu Leu
290 295 300

aga gac cat ctg tgg aag tgg gga ttt tac aca cca gac aac aaa yat 960
Arg Asp His Leu Trp Lys Trp Gly Phe Tyr Thr Pro Asp Asn Lys Xaa
305 310 315 320

cag aaa gaa cct cca ttc cgt tgg atg ggc tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Arg Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aag gat agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

aat tat gca gga 1116
Asn Tyr Ala Gly
370



<210> 94
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase
<400> 94

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta ata gag gct cta ttg gat aca gga gca gat gat aca gta 96
Gly Gln Leu Ile Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg gat ttg cca gga aga tgg aaa cca aaa ata ata ggg 144
Leu Glu Glu Met Asp Leu Pro Gly Arg Trp Lys Pro Lys Ile Ile Gly
35 40 45

gga att gga ggt tgg atc aaa gta aga caa tat gat cag ata ccc ata 192
Gly Ile Gly Gly Trp Ile Lys Val Arg Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa att tgt gga cat aaa gtt ata agt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Val Ile Ser Thr Val Leu Val Gly Pro Thr
65 70 75 80

cca gtc aac gta att gga aga aat ctg atg act cag att ggt tgc act 288
Pro Val Asn Val Ile Gly Arg Asn Leu Met Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aga gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Arg Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aag ata aaa gca tta gta gaa att tgt aca gaa ttg gaa aag gat ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Leu Glu Lys Asp Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa gta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Val Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta ggg 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta cca aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Pro Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat gaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg
210 215 220

aaa tat act gca ttt acc ata cct agt ata aat aat gag aca cca gga 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

gtt aga tat cag tac aat gtg ctc cca cag ggg tgg aaa gga tca cca 768
Val Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg acc aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

cag aat cca aac ata ctt att tgt caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asn Ile Leu Ile Cys Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata gaa cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Glu Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg tgg aga tgg ggg ttt tac aca cca gat aaa aaa cat 960
Arg Gln His Leu Trp Arg Trp Gly Phe Tyr Thr Pro Asp Lys Lys His
305 310 315 320

cag aag gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gag ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Glu Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gat ata cag aag tta gtg gga aaa ttg aat tgg gca agy cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Xaa Gln
355 360 365

att tat gca ggg 1116
Ile Tyr Ala Gly
370

<210> 95
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase
<400> 95

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga agg tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag ata tcc gta 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Ser Val
50 55 60

gaa atc tgt ggr cat aaa gct ata ggt aca gta tta rta gga cct aca 240
Glu Ile Cys Xaa His Lys Ala Ile Gly Thr Val Leu Xaa Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga agg aat ttg ttg act cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aaa act caa gac ttt tgg gar gtt caa tta gga 576
Arg Glu Leu Asn Lys Lys Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tac act gca ttt act ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat caa tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc cag tgt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Cys Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gar ata gtt atc tat caa tac atg gat gat ctg tat gta 864
Gln Asn Pro Glu Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata gaa cag cat aga ata aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Glu Gln His Arg Ile Lys Ile Glu Glu Leu
290 295 300

aga cac cat ctg ttg aaa tgg gga ttt wmc aca cca gac aaa aaa cat 960
Arg His His Leu Leu Lys Trp Gly Phe Xaa Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aar gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa tta aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tac cca ggg 1116
Ile Tyr Pro Gly
370



<210> 96
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase
<400> 96

cct caa atc act ctt tgg caa cga ccc aat gtc aca gta aag ata ggr 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Asn Val Thr Val Lys Ile Xaa
1 5 10 15

ggg caa cta agg gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Arg Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa ata aat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Ile Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att ggg ggt ttt atc aaa gta aga sag tat gat cag gta ccc gta 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Xaa Tyr Asp Gln Val Pro Val
50 55 60

gaa atc tgt gga cat aaa gct ata ggt aca gta tta gta gga ccc aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta ara tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Xaa Leu Lys
100 105 110

cca ggr atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Xaa Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa atc tgt aca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca ata ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Ile Phe
145 150 155 160

gcc ata aag aaa aaa gac ggt act aaa tgg aga aaa gta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Gly Thr Lys Trp Arg Lys Val Val Asp Phe
165 170 175

agg gaa ctc aat aag aga act caa gac ttc tgg gaa gtt caa tta ggm 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Xaa
180 185 190

ata cca cat ccc gca ggg ttg aaa aag aaa aaa tca gtr aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Xaa Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat gaa gaa ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Glu Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt gta aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Val Asn Asn Glu Thr Pro Gly
225 230 235 240

atc aga tat caa tac aat gtg ctt cca cag gga tgg aag gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttt caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtc atc tat caa tac gtg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg ttg aag tgg gga ttt acc aca cca gac aaa aaa cat 960
Arg Gln His Leu Leu Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gag cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cgt ata gag ctg cca gaa aag gag agc tgg act 1056
Lys Trp Thr Val Gln Arg Ile Glu Leu Pro Glu Lys Glu Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt caa 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

atw tac cca ggg 1116
Xaa Tyr Pro Gly
370



<210> 97
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase
<400> 97

cct caa atc act ctt tgg caa cga ccc ctc gtc aaa ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Lys Ile Lys Ile Gly
1 5 10 15

ggg caa ata aag gaa gcy tta tta gat aca gga gca gat gat aca gtg 96
Gly Gln Ile Lys Glu Xaa Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga aaa tgg aaa cca aaa ttg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Lys Trp Lys Pro Lys Leu Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag ata ctt ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Leu Ile
50 55 60

gaa atc tgt ggc cat aaa gct ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gcc aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288
Pro Ala Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta cta gaa att tgt aca gaa ctg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Leu Glu Ile Cys Thr Glu Leu Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttt tgg gag gtt caa cta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gsa ggg tta aga aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Xaa Gly Leu Arg Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta tat gag gac tty agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Tyr Glu Asp Phe Arg
210 215 220

aaa tat act gca ttt act ata cct agt aca aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240

att agg tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtt atc trt caa tac gtg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Xaa Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg tgg cag tgg gga ttt ttc aca cca gac aaa aaa cat 960
Arg Gln His Leu Trp Gln Trp Gly Phe Phe Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gta ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tac cca ggg 1116
Ile Tyr Pro Gly
370

<210> 98
<211> 1115
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1115)
<223> Portion of HIV Reverse Transcriptase
<400> 98

cct caa atc act ctt tgg caa cga ccc gtc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Val Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg cat ttg cca gga aaa tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met His Leu Pro Gly Lys Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag ata cct gta 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Pro Val
50 55 60

gaa aty tgt gga cat aaa gct ata ggt aca gta tta gta gga cct aca 240
Glu Xaa Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca ggg atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa ata tgt aca gaa atg gaa aag gaa ggg 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att tca aaa att ggg cca gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa ttg gga 576 
Arg Glu Leu Asn Lys Arg Thr Gin Asp Phe Trp Glu Val Gin Leu Gly 
180 185 190 

ata cca cat ccc gca gga tta aaa aag aaa aaa tea gta aca gta ctg 624 
lie Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu 
195 200 205 

gat gtg ggt gat gca tat ttt tea gtt ccc tta gat aaa gac ttc agg 672 
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg 
210 215 220 

aag tac act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 72 0 

Lys Tyr Thr Ala Phe Thr lie Pro Ser lie Asn Asn Glu Thr Pro Gly 
225 230 235 240 

att aga tat cag tat aat gtg ctt cca cag gga tgg aaa gga tea cca 768 
lie Arg Tyr Gin Tyr Asn Val Leu Pro Gin Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa agt age atg aca aaa ate tta gag cct ttt aga aaa 816 
Ala lie Phe Gin Ser Ser Met Thr Lys lie Leu Glu Pro Phe Arg Lys 
260 265 270 

caa aat cca gay ata gtt att tat caa tac atg gat gat ttg tat gta 864 
Gin Asn Pro Asp lie Val lie Tyr Gin Tyr Met Asp Asp Leu Tyr Val 
275 280 285 

gga tec gac eta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912 
Gly Ser Asp Leu Glu lie Gly Gin His Arg Thr Lys lie Glu Glu Leu 
290 295 300 

aga caa cac ctg ttg aag tgg ggr ttt acc ack cca gac aaa aaa cat 960 
Arg Gin His Leu Leu Lys Trp Xaa Phe Thr Xaa Pro Asp Lys Lys His 
305 310 315 320 

cag aag gaa cct cca ttc ctt tgg atg ggt tat gaa etc cat cct gat 1008 
Gin Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp 
325 330 335 

aaa tgg aca gta cag cct ata gta ctg cca gaa aaa gat agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365 

att tac tca gt 1115
Ile Tyr Ser
370 

<210> 99
<211> 1115
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1115)
<223> Portion of HIV Reverse Transcriptase
<400> 99

cct cag atc act ctt tgg cag cga ccc ctc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa eta aag gaa get yta tta gat aca gga gca gat gat aca gta 96 
Gly Gin Leu Lys Glu Ala Xaa Leu Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 

tta gaa gaa atg aat ttg cca gga agr tgg aaa cca aaa atg ata ggg 144 
Leu Glu Glu Met Asn Leu Pro Gly Xaa Trp Lys Pro Lys Met lie Gly 
35 40 45 

gga att gga ggc ttt ate aaa gta aga cag tat gat cag ata ccc eta 192 
Gly lie Gly Gly Phe lie Lys Val Arg Gin Tyr Asp Gin lie Pro Leu 
50 55 60 

gaa ate tgt ggc cat aag get ata ggt aca gta tta gta gga cct aca 240 
Glu lie Cys Gly His Lys Ala lie Gly Thr Val Leu Val Gly Pro Thr 
65 70 75 80 

cct gtc aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288 
Pro Val Asn lie lie Gly Arg Asn Leu Leu Thr Gin lie Gly Cys Thr 
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cct gta aaa tta aag 336 
Leu Asn Phe Pro lie Ser Pro lie Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggt cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125 

aaa ata aaa gca tta gta gaa att tgt aca gag atg gaa aag gaa ggg 432 
Lys lie Lys Ala Leu Val Glu lie Cys Thr Glu Met Glu Lys Glu Gly 
130 135 140 

aaa att tea aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480 
Lys lie Ser Lys lie Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190 

ata cca cat ccc tea ggg tta raa aag aag aaa tea gta aca gta ctg 624 
lie Pro His Pro Ser Gly Leu Xaa Lys Lys Lys Ser Val Thr Val Leu 
195 200 205 

gat gtg ggt gat gca tat ttt tea gtt ccc tta gat cca gat ttc agg 672 
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Pro Asp Phe Arg 
210 215 220 

aag tat act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720 
Lys Tyr Thr Ala Phe Thr lie Pro Ser lie Asn Asn Glu Thr Pro Gly 
225 230 235 240 

att agg tat cag tac aat gtg ctt cca cag gga tgg aaa gga tea cca 768 
lie Arg Tyr Gin Tyr Asn Val Leu Pro Gin Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa age age atg aca aaa ate tta gag cct ttt aga aaa 816 
Ala lie Phe Gin Ser Ser Met Thr Lys lie Leu Glu Pro Phe Arg Lys 
260 265 270 

caa aat cca gaa ata gtt ate tac caa tac dtg gat gat ttg tak gta 864 
Gin Asn Pro Glu lie Val lie Tyr Gin Tyr Xaa Asp Asp Leu Xaa Val 
275 280 285 

rgc tct gac tta gaa ata ggg cag cat aga gca aaa ata gag gaa ctg 912 
Xaa Ser Asp Leu Glu lie Gly Gin His Arg Ala Lys lie Glu Glu Leu 
290 295 300 

aga caa cat ctg ttg agg tgg gga ttt acc aca cca gac aaa aag cat 960 
Arg Gin His Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His 
305 310 315 320 

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa etc cat cct gat 1008 
Gin Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp 
325 330 335 

aaa tgg aca gtt cag cct ata gtg ctg cca gaa aag gac age tgg act 1056 
Lys Trp Thr Val Gin Pro lie Val Leu Pro Glu Lys Asp Ser Trp Thr 
340 345 350 

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104 
Val Asn Asp lie Gin Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gin 
355 360 365 

att tat gca gg 1115 
lie Tyr Ala 
370 

<210> 100
<211> 1115
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1115)
<223> Portion of HIV Reverse Transcriptase
<400> 100

cct caa atc act ctt tgg caa cga ccc cta gtc aca ata aag ata gga 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg cag ctr aag gaa gct ata tta gat aca gga gca gat gat aca kta 96
Gly Gln Xaa Lys Glu Ala Ile Leu Asp Thr Gly Ala Asp Asp Thr Xaa
20 25 30 

tta gaa gaa atg aat tng ccc gga aga tgg ama cca ama ttg ata ggg 144 
Leu Glu Glu Met Asn Xaa Pro Gly Arg Trp Xaa Pro Xaa Leu He Gly 
35 40 45 

gga att gga ggt ttt ate aaa gta aga cag tat gat cag ata ccc ata 192 
Gly He Gly Gly Phe He Lys Val Arg Gin Tyr Asp Gin He Pro He 
50 55 60 

gaa atc tgt gga cat aaa gtt ata ggt aca gta ttg gta gga cct aca 240
Glu Ile Cys Gly His Lys Val Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80 

cct acc aac ata att gga aga aat ctg atg act cag ctt ggt tgc act 288
Pro Thr Asn Ile Ile Gly Arg Asn Leu Met Thr Gln Leu Gly Cys Thr
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336 
Leu Asn Phe Pro He Ser Pro He Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125 

aaa ata aaa gca tta gta gaa att tgt aca gaa ttg gaa aag gaa ggg 432 
Lys He Lys Ala Leu Val Glu He Cys Thr Glu Leu Glu Lys Glu Gly 
130 135 140 

aaa att tea aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480 
Lys He Ser Lys He Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175 

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576 
Arg Glu Leu Asn Lys Arg Thr Gin Asp Phe Trp Glu Val Gin Leu Gly 
180 185 190 

ata cca cat ccc gca ggg tta aaa aag aaa aaa tea gta aca ata ctg 624 
He Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr He Leu 
195 200 205 

gat gtg ggc gat gca tat ttt tea gtt ccc tta gat aaa gac ttc agg 672 
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg 
210 215 220 

aaa gta tac tgc ttt acc ata cct agt ata acc aat gag acm cca ggg 720
Lys Val Tyr Cys Phe Thr Ile Pro Ser Ile Thr Asn Glu Xaa Pro Gly
225 230 235 240 

att aga tat cag tac aat gtg ctg cca caa gga tgg aaa gga tea cca 768 
He Arg Tyr Gin Tyr Asn Val Leu Pro Gin Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa agt agc atg aca aaa atc tta gag ccy ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Xaa Phe Arg Lys
260 265 270

caa aat cca gac ata gtt ate tat caa tac gtg gat gat ttg tat gta 864 
Gin Asn Pro Asp lie Val lie Tyr Gin Tyr Val Asp Asp Leu Tyr Val 
275 280 285 

gga tct gac tta gaa ata ggg cag cat aga gca aaa ata gag gaa ctg 912 
Gly Ser Asp Leu Glu lie Gly Gin His Arg Ala Lys lie Glu Glu Leu 
290 295 300 

aga caa cat ctg tgg agg tgg gga ttt tac aca cca gac aaa aaa cat 960 
Arg Gin His Leu Trp Arg Trp Gly Phe Tyr Thr Pro Asp Lys Lys His 
305 310 315 320 

cag aag gaa cct cca ttc ctt tgg atg ggt tat gaa etc cat cct gat 1008 
Gin Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp 
325 330 335 

aaa tgg aca gta cag cct ata arg ttg cca gaa aaa gac age tgg act 1056 
Lys Trp Thr Val Gin Pro lie Xaa Leu Pro Glu Lys Asp Ser Trp Thr 
340 345 350 

gtc aat gam ata cag aaa tta gtg gga aaa tta aat tgg gec agt cag 1104 
Val Asn Xaa lie Gin Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gin 
355 360 365 

att tck cng gg 1115 
lie Xaa Xaa 
370 



<210> 101
<211> 1096
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1096)
<223> Portion of HIV Reverse Transcriptase
<400> 101

cct car atc act ctt tgg cag acc ccc ctt gtc yca ata agg aka ggg 48
Pro Gln Ile Thr Leu Trp Gln Thr Pro Leu Val Xaa Ile Arg Xaa Gly
1 5 10 15 

ggr cag yta aag gaa get tta tta gay aca gra gca gat gat mca gta 96 
Xaa Gin Xaa Lys Glu Ala Leu Leu Asp Thr Xaa Ala Asp Asp Xaa Val 
20 25 30 

tta gaa gaa atg tat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144 
Leu Glu Glu Met Tyr Leu Pro Gly Arg Trp Lys Pro Lys Met lie Gly 
35 40 45 

gga att gga ggt ttt ate aag gta aga cag tat gat cag ata ccc ata 192 
Gly lie Gly Gly Phe lie Lys Val Arg Gin Tyr Asp Gin lie Pro lie 
50 55 60 

gaa ate tgt gga cac aaa get ata ggt aca gta ttg gta gga tct aca 240 
Glu He Cys Gly His Lys Ala He Gly Thr Val Leu Val Gly Ser Thr 
65 70 75 80 

cct gtt aac ata att gga aga aat ctg ttg act cag att ggt tgc acc 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95 

tta aat ttt ccc att agt tct att gaa act gta cca gta aga tta aag 336 
Leu Asn Phe Pro lie Ser Ser lie Glu Thr Val Pro Val Arg Leu Lys 
100 105 110 

ccc gga atg gat ggc cca aaa gtt aag caa tgg cca tta aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125 

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa ggg 432 
Lys lie Lys Ala Leu Val Glu lie Cys Thr Glu Met Glu Lys Glu Gly 
130 135 140 

aaa att tea aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480 
Lys lie Ser Lys lie Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gcc ata aag aaa aag aac agt gat aga tgg aga aaa gta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Ser Asp Arg Trp Arg Lys Val Val Asp Phe
165 170 175 

aga gaa ctt aat aag aga acc caa gac ttt tgg gaa gtt caa tta gga 576 
Arg Glu Leu Asn Lys Arg Thr Gin Asp Phe Trp Glu Val Gin Leu Gly 
180 185 190 

ata cca cat ccc gca ggg tta aaa agg aga aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Arg Arg Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tac ttt tca att ccc tta gat aaa gaa ttc aga 672
Asp Val Gly Asp Ala Tyr Phe Ser Ile Pro Leu Asp Lys Glu Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt aca aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240 

ate aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tea cca 768 
He Arg Tyr Gin Tyr Asn Val Leu Pro Gin Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa agt age atg aca aaa ate tta gag cct ttt aga gaa 816 
Ala He Phe Gin Ser Ser Met Thr Lys He Leu Glu Pro Phe Arg Glu 
260 265 270 

cag aat cca gac atg gtt atc tat caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Met Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga gca aaa ata gag gaa ctg 912 
Gly Ser Asp Leu Glu He Gly Gin His Arg Ala Lys He Glu Glu Leu 
290 295 300 

aga caa cat ctg ttg agg tgg gga tta ttc aca cca gac caa aaa cat 960 
Arg Gin His Leu Leu Arg Trp Gly Leu Phe Thr Pro Asp Gin Lys His 
305 310 315 320 

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat ccg gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag act ata gtg ctg cca gag aag gac agc tgg act 1056
Lys Trp Thr Val Gln Thr Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gta gga aaa ttg aat tgg g 1096
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp
355 360 365 

<210> 102
<211> 1048
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1048)
<223> Portion of HIV Reverse Transcriptase
<400> 102

cct cag atc act ctt tgg cag cga ccc tty gtc aca ata aag gta ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Phe Val Thr Ile Lys Val Gly
1 5 10 15

ggg caa cta aag gaa gct cta ttg gat aca gga gca gat gat aca ata 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Ile
20 25 30 

tta gaa gaa atg tgt ttg cca gga aga tgg aaa cca aaa ttg ata ggg 144 
Leu Glu Glu Met Cys Leu Pro Gly Arg Trp Lys Pro Lys Leu lie Gly 
35 40 45 

gga att gga ggt ttt gtc aaa gta aga caa tat gat cag ata ccc ata 192 
Gly lie Gly Gly Phe Val Lys Val Arg Gin Tyr Asp Gin lie Pro lie 
50 55 60 

gaa ate tgt gga cat aaa gtt ata ggt aca gta tta gta gga cct aca 240 
Glu lie Cys Gly His Lys Val lie Gly Thr Val Leu Val Gly Pro Thr 
65 70 75 80 

cct gec aac ata gtt gga aga aat ctg ttg act cag att ggc tgt act 288 
Pro Ala Asn lie Val Gly Arg Asn Leu Leu Thr Gin lie Gly Cys Thr 
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336 
Leu Asn Phe Pro lie Ser Pro lie Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggg cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125 

aaa ata aaa gca tta gta gaa att tgt aca gaa ttg gag aag gat gga 432 
Lys lie Lys Ala Leu Val Glu lie Cys Thr Glu Leu Glu Lys Asp Gly 
130 135 140 

aaa att tea aaa att ggg cct gaa aat cca tay aat act cca gta ttt 480 
Lys lie Ser Lys lie Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gcc ata aag aaa aaa aat agt gat aaa tgg aga aaa gta gta gat ttc 528
Ala Ile Lys Lys Lys Asn Ser Asp Lys Trp Arg Lys Val Val Asp Phe
165 170 175 

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtc caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat ccc gga ggg tta rag aag aac aaa tca ata aca gta ctg 624
Ile Pro His Pro Gly Gly Leu Xaa Lys Asn Lys Ser Ile Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca att ccc tta gat aaa gac ttc aga 672
Asp Val Gly Asp Ala Tyr Phe Ser Ile Pro Leu Asp Lys Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata ccy agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Xaa Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

att aga tat cag tat aat gtg ctt cca cag gga tgg aag gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gcc ata ttc caa agt agc atg aca aaa ata tta gag cct ttt aga aag 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata att atc gtt caa tac gtg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Ile Ile Val Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gca tct gac tta gaa ata ggg cag cat aga aca aaa ata aag gaa cta 912
Ala Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Lys Glu Leu
290 295 300

aga caa tat ctg tgg gag tgg gga ttt tac aca cca gac aaa aaa cat 960
Arg Gln Tyr Leu Trp Glu Trp Gly Phe Tyr Thr Pro Asp Lys Lys His
305 310 315 320

caa cag gaa ccc cca ttc ctc tgg atg ggg tat gag ctc cat cct gat 1008
Gln Gln Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac a 1048
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp
340 345



<210> 103
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase
<400> 103

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata arg rta ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Xaa Xaa Gly
1 5 10 15

ggg cag eta aag gaa get eta tta gat aca gga gca gat gat aca gta 96 
Gly Gin Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 

tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag ata ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Pro Ile
50 55 60

gaa atc tgt gga cat aaa gct gaa ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Glu Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

ccg gtc aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ctg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta aba gaa att tgt aca gaa atg gaa aag gaa ggr 432
Lys Ile Lys Ala Leu Xaa Glu Ile Cys Thr Glu Met Glu Lys Glu Xaa
130 135 140

aaa att tca aaa att ggg cct gaa aat cca tac aat act ccg gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aaa act caa gac ttt tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Lys Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cac ccc gca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gaa ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Glu Phe Arg
210 215 220

aag tat aca gca ttt acc ata cct agt aca aac aat gag aca ccc agg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Arg
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tcg cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gac ata gtt atc tat caa tat gtg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gag ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga saa cat ctg ttg agg tgg gga ttt acc aca cca gac aaa aaa cat 960
Arg Xaa His Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320 

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gtr cag cct ata rag ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Xaa Gln Pro Ile Xaa Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aaa tta gtg gga aaa tta aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tac gca gga 1116
Ile Tyr Ala Gly
370 



<210> 104
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase
<400> 104

cct cag atc act ctt tgg caa cga ccc mty gtc aca ata aag gta ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Xaa Val Thr Ile Lys Val Gly
1 5 10 15

ggg caa tta aaa gaa get eta tta gat aca gga gca gat gat aca gta 96 
Gly Gin Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 

eta gaa gaa ata aat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144 
Leu Glu Glu lie Asn Leu Pro Gly Arg Trp Lys Pro Lys Met lie Gly 
35 40 45 

gga att gga ggt ttt ate aaa gta aga cag tat gat car ata cyt ata 192 
Gly lie Gly Gly Phe He Lys Val Arg Gin Tyr Asp Gin He Xaa He 
50 55 60 

gaa atc tgt gga cat aaa gct ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80 

cct gtc aac ata att gga aga aat ctg ttr act cag att ggc tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Xaa Thr Gln Ile Gly Cys Thr
85 90 95 

tta aat ttt ccc ata agt cct att gaa act gta cca gta aaa tta aag 336 
Leu Asn Phe Pro He Ser Pro He Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gya gaa att tgt aca gaa atg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Xaa Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140 

aaa att tea aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480 
Lys lie Ser Lys lie Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gct ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175 

aga gaa ctt aat aag aga act caa gac ttt tgg gaa gtt caa tta gga 576 
Arg Glu Leu Asn Lys Arg Thr Gin Asp Phe Trp Glu Val Gin Leu Gly 
180 185 190 

ata cca cat cca gca ggg eta cca agg aaa aga tea gta aca gta ctg 624 
lie Pro His Pro Ala Gly Leu Pro Arg Lys Arg Ser Val Thr Val Leu 
195 200 205 

gat gtg ggt gat gca tat ttt tea gtt ccc tta gat cca gac ttc agg 672 
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Pro Asp Phe Arg 
210 215 220 

aag tat act gca ttt acc ata cct agt ata aac aat gag aca ccg ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240 

att aga tat cag tac aat gta ctt cca cag gga tgg aaa gga tea cca 768 
He Arg Tyr Gin Tyr Asn Val Leu Pro Gin Gly Trp Lys Gly Ser Pro 
245 250 255 

gec ata ttc caa agt age atg aca aaa att tta gat cct ttt aga aaa 816 
Ala lie Phe Gin Ser Ser Met Thr Lys He Leu Asp Pro Phe Arg Lys 
260 265 270 

caa aat cca gac ata att ate tat caa tac gtg gat gat ttg tat gta 864 
Gin Asn Pro Asp He He He Tyr Gin Tyr Val Asp Asp Leu Tyr Val 
275 280 285 

gca tct gac tta gaa ata ggg cag cac aga aca aaa ata gaa gaa eta 912 
Ala Ser Asp Leu Glu He Gly Gin His Arg Thr Lys He Glu Glu Leu 
290 295 300 

aga caa cat ctg ttg agg tgg gga ttt tac aca cca gac aaa aaa cat 960 
Arg Gin His Leu Leu Arg Trp Gly Phe Tyr Thr Pro Asp Lys Lys His 
305 310 315 320 

cag aaa gaa cct cca ttt ctt tgg atg ggt tat gaa etc cat cct gat 1008 
Gin Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp 
325 330 335 

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac age tgg act 1056 
Lys Trp Thr Val Gin Pro He Val Leu Pro Glu Lys Asp Ser Trp Thr 
340 345 350 

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104 
Val Asn Asp He Gin Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gin 
355 360 365 

att tat gca ggg 1116 
He Tyr Ala Gly 
370 

<210> 105
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase
<400> 105

cct cag atc act ctt tgg caa cga ccc ttc gtc gtc gta aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Phe Val Val Val Lys Ile Gly
1 5 10 15

ggg caa eta aag gaa get eta tta gat aca gga gca gat aat aca gta 96 
Gly Gin Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asn Thr Val 
20 25 30 

ttt gaa gac ytg aat ttg cca gga aaa tgg aaa cca aaa atg ata ggg 144 
Phe Glu Asp Xaa Asn Leu Pro Gly Lys Trp Lys Pro Lys Met lie Gly 
35 40 45 

gga att gga ggt ttt ate aaa gta aga cag tat gat cag gta ctt gta 192 
Gly lie Gly Gly Phe lie Lys Val Arg Gin Tyr Asp Gin Val Leu Val 
50 55 60 

gaa atc tgt gga caa aaa gct ata ggt aca gta tta ata gga cct aca 240
Glu Ile Cys Gly Gln Lys Ala Ile Gly Thr Val Leu Ile Gly Pro Thr
65 70 75 80 

cct gtc aac ata att gga agg gat ctg ttg act cag att ggt tgc act 288 
Pro Val Asn lie lie Gly Arg Asp Leu Leu Thr Gin lie Gly Cys Thr 
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336 
Leu Asn Phe Pro lie Ser Pro lie Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125 

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa ggg 432 
Lys lie Lys Ala Leu Val Glu lie Cys Thr Glu Met Glu Lys Glu Gly 
130 135 140 

aar att tea aaa att ggg cct gaa aac cca tac aat act cca gta ttt 480 
Lys lie Ser Lys lie Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gcc ata aag aaa aaa gac agt act aar tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175 

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576 
Arg Glu Leu Asn Lys Arg Thr Gin Asp Phe Trp Glu Val Gin Leu Gly 
180 185 190 

ata cca cat cct gca ggg tta aaa aag aaa aaa tea gta aca gta ctg 624 
lie Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu 
195 200 205 

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat gaa gay ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agc ata aac aat gag aca cca gga 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240 

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tea cca 768 
lie Arg Tyr Gin Tyr Asn Val Leu Pro Gin Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa tgt age atg aca aaa ate tta gat cct ttt aga aag 816 
Ala lie Phe Gin Cys Ser Met Thr Lys He Leu Asp Pro Phe Arg Lys 
260 265 270 

caa aat cca gac eta gtt ate tat caa tac rtg gat gac ttg tat gta 864 
Gin Asn Pro Asp Leu Val He Tyr Gin Tyr Xaa Asp Asp Leu Tyr Val 
275 280 285 

gga tct gat tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912 
Gly Ser Asp Leu Glu He Gly Gin His Arg Thr Lys He Glu Glu Leu 
290 295 300 

aga car cat ctg ttg aag tgg gga ttt acc aca cca gac aaa aar cat 960 
Arg Gin His Leu Leu Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys His 
305 310 315 320 

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa etc cat cct gat 1008 
Gin Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp 
325 330 335 

aaa tgg aca gta cag cct ata gtg ctg cca gaa aag gac age tgg act 1056 
Lys Trp Thr Val Gin Pro He Val Leu Pro Glu Lys Asp Ser Trp Thr 
340 345 350 

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104 
Val Asn Asp He Gin Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gin 
355 360 365 

att tac cca ggg 1116 
He Tyr Pro Gly 
370 

<210> 106
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase
<400> 106

cct cag atc act ctt ngg caa cga ccm att gtc aca ata aag gta ggg 48
Pro Gln Ile Thr Leu Xaa Gln Arg Xaa Ile Val Thr Ile Lys Val Gly
1 5 10 15

ggg cam tta aaa gaa gtt ytt tta gat mma gga gca gat gat cma gta 96 
Gly Xaa Leu Lys Glu Val Xaa Leu Asp Xaa Gly Ala Asp Asp Xaa Val 
20 25 30 

tta gaa gaa atr gat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Xaa Asp Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45 

gga att gga ggt ttt ate aaa gta aga cag tat gat caa ata gtt gta 192 
Gly lie Gly Gly Phe lie Lys Val Arg Gin Tyr Asp Gin lie Val Val 
50 55 60 

gaa ate tgt gga cat aaa get ata ggt aca gta tta gta gga cct aca 240 
Glu lie Cys Gly His Lys Ala lie Gly Thr Val Leu Val Gly Pro Thr 
65 70 75 80 

cct gtc aac ata att gga aga aat ctg ttg act cag ctt ggt tgc act 288 
Pro Val Asn lie lie Gly Arg Asn Leu Leu Thr Gin Leu Gly Cys Thr 
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336 
Leu Asn Phe Pro lie Ser Pro lie Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gag gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125 

aaa ata aaa gca ttg gta gaa att tgt aca gaa atg gaa aag gaa gga 432 
Lys lie Lys Ala Leu Val Glu lie Cys Thr Glu Met Glu Lys Glu Gly 
130 135 140 

aaa att tea aaa aty ggg cct gaa aat cca tac aat act cca gta ttt 480 
Lys lie Ser Lys Xaa Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175 

agg gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576 
Arg Glu Leu Asn Lys Arg Thr Gin Asp Phe Trp Glu Val Gin Leu Gly 
180 185 190 

ata cca cat ccc gca ggg yta aaa aag aac aaa tea gta aca gta ctg 624 
lie Pro His Pro Ala Gly Xaa Lys Lys Asn Lys Ser Val Thr Val Leu 
195 200 205 

gat gtg ggt gat gca tat ttc tea gtt ccc tta gat aaa gac ttt agg 672 
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg 
210 215 220 

aag tat act gca ttt acc ata ccc agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240 

att aga tat cag tat aat gtg ctt cca caa gga tgg aaa gga tea cca 768 
lie Arg Tyr Gin Tyr Asn Val Leu Pro Gin Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa agt age atg aca aaa ate eta gag cct ttt agg aaa 816 
Ala lie Phe Gin Ser Ser Met Thr Lys lie Leu Glu Pro Phe Arg Lys 
260 265 270 

caa aat cca gac ata gtt ate tat caa tac atg gat gat ttg tat gta 864 
Gin Asn Pro Asp lie Val lie Tyr Gin Tyr Met Asp Asp Leu Tyr Val 
275 280 285 

gga tct gac tta gaa ata ggg cag cat aga aca aaa ata gaa gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga gca cat ctg tta aag tgg gga ttt acc aca cca gay aaa aag cat 960
Arg Ala His Leu Leu Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttt ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335 

aaa tgg aca gtg cag cct ata aag ctg cca gaa aaa gac age tgg act 1056 
Lys Trp Thr Val Gin Pro lie Lys Leu Pro Glu Lys Asp Ser Trp Thr 
340 345 350 

gtc aat gat ata cag aag tta gtg gga aaa ttg aat tgg gec agt cag 1104 
Val Asn Asp lie Gin Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gin 
355 360 365 

att tat cca gga 1116 
lie Tyr Pro Gly 
370 

<210> 107
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)
<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease
<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase
<400> 107

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa eta aag gaa get tta tta gat aca gga gca gat gat aca gta 96 
Gly Gin Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 

tta gaa gaa atg gaa ttg cca gga aga tgg aaa cca aaa atg ata ggg 144 
Leu Glu Glu Met Glu Leu Pro Gly Arg Trp Lys Pro Lys Met lie Gly 
35 40 45 

gga att gga ggt ttt ate aaa gta agm cag tat gat cag ata ccc ata 192 
Gly He Gly Gly Phe He Lys Val Xaa Gin Tyr Asp Gin He Pro He 
50 55 60 

gaa att tgt gga cat aaa gct gtg ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Val Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80 

cct gtc aac ata att gga aga aat ctg ttg act aag att ggt tgc act 288 
Pro Val Asn He He Gly Arg Asn Leu Leu Thr Lys He Gly Cys Thr 
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336 
Leu Asn Phe Pro He Ser Pro He Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140 

aaa att tea aaa att gga cct gaa aat cca tac aat act cca gta ttt 480 
Lys lie Ser Lys lie Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175 

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576 
Arg Glu Leu Asn Lys Arg Thr Gin Asp Phe Trp Glu Val Gin Leu Gly 
180 185 190 

ata cca cat ccc gca ggg tta aaa mgg aaa aaa tea gta aca gta ctg 624 
lie Pro His Pro Ala Gly Leu Lys Xaa Lys Lys Ser Val Thr Val Leu 
195 200 205 

gat gtg ggt gat gca tat ttt tea gtt ccc tta gat aaa gag ttc agg 672 
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Glu Phe Arg 
210 215 220 

aag tat act gca ttt acc ata cct agt ata aac aat gag aca cca gga 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240 

att aga tat cag tac aat gtg yyt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Xaa Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255 

gca ata ttc caa agt age atg aca aaa ate tta gag cct ttt aga aaa 816 
Ala lie Phe Gin Ser Ser Met Thr Lys lie Leu Glu Pro Phe Arg Lys 
260 265 270 

caa aat cca gaa ata gtt ate tat cag tac atg gat gat ttg tat gta 864 
Gin Asn Pro Glu lie Val lie Tyr Gin Tyr Met Asp Asp Leu Tyr Val 
275 280 285 

gga tct gac tta gaa ata ggg cag cac aga aca aaa ata gag gaa ctg 912 
Gly Ser Asp Leu Glu lie Gly Gin His Arg Thr Lys lie Glu Glu Leu 
290 295 300 

aga caa cat ctg ttg aag tgg gga ttt acc aca cca gac aaa aaa cat 960 
Arg Gin His Leu Leu Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys His 
305 310 315 320 

cag aaa gaa ccc cca ttc ctt tgg atg ggt tat gaa etc cat cct gat 1008 
Gin Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp 
325 330 335 

aaa tgg aca gta cag cct ata gtg eta cca gaa aaa gac age tgg act 1056 
Lys Trp Thr Val Gin Pro lie Val Leu Pro Glu Lys Asp Ser Trp Thr 
340 345 350 

gtc aat gac ata cag aag tta gtg gga aaa tta aat tgg gcg agt cag 1104 
Val Asn Asp lie Gin Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gin 
355 360 365 

att tay gca ggg 1116 
lie Tyr Ala Gly 
370 

<210> 108
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 108

cct caa atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg cag eta aag gaa get eta tta gat aca gga gca gat gat aca gtg 96 
Gly Gin Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 

tta gaa gaa atg aat ttg cca ggg aaa tgg aag cca aaa atg ata ggg 144 
Leu Glu Glu Met Asn Leu Pro Gly Lys Trp Lys Pro Lys Met lie Gly 
35 40 45 

gga att gga ggg ttt atc aaa gta agm crg tat gat cag ata ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Xaa Xaa Tyr Asp Gln Ile Pro Ile
50 55 60 

gaa ate tgt gra cat aaa get aya ggt aca gta tta ata ggm cct act 240 
Glu lie Cys Xaa His Lys Ala Xaa Gly Thr Val Leu lie Xaa Pro Thr 
65 70 75 80 

cct gtc aac ata att gga aga awt ctg atg act cag att ggg tgc act 288 
Pro Val Asn lie He Gly Arg Xaa Leu Met Thr Gin He Gly Cys Thr 
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110 

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gaa gag 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aag gaa ggg 432 
Lys He Lys Ala Leu Val Glu He Cys Thr Glu Met Glu Lys Glu Gly 
130 135 140 

aag att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gct ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175 

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190 

ata cca cat cct gca ggt tta aaa aag aaa aaa tea gta aca gta eta 624 
He Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu 
195 200 205 

gat gtg ggg gat gca tat ttt tca gtt ccc tta gat gaa aac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asn Phe Arg
210 215 220 

aag tat act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240 

att aga tat cag tac aat gta ctt cca cag gga tgg aaa gga tea cca 768 
lie Arg Tyr Gin Tyr Asn Val Leu Pro Gin Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa tgt age atg aca aaa ate tta gag cct ttc aga aag 816 
Ala lie Phe Gin Cys Ser Met Thr Lys lie Leu Glu Pro Phe Arg Lys 
260 265 270 

caa aat cca gaa atg gtt ate trc caa tac gtg gat gay ttg tat gta 864 
Gin Asn Pro Glu Met Val lie Xaa Gin Tyr Val Asp Asp Leu Tyr Val 
275 280 285 

ggt tct gac tta gaa ata ggg cag cat aga gca aaa ata gag gaa ctr 912 
Gly Ser Asp Leu Glu lie Gly Gin His Arg Ala Lys lie Glu Glu Xaa 
290 295 300 

aga caa cat ctg ttg agg tgg gga ttt acc aca cca gac aaa aaa cat 960 
Arg Gin His Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His 
305 310 315 320 

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctm cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Xaa His Pro Asp
325 330 335 

aaa tgg aca gtg cag cat ata gaa ctg cca gaa caa gag age tgg act 1056 
Lys Trp Thr Val Gin His lie Glu Leu Pro Glu Gin Glu Ser Trp Thr 
340 345 350 

gtc aat gac ata cag aag tta gtg gga aaa yta aat tgg gca agy cag 1104 
Val Asn Asp lie Gin Lys Leu Val Gly Lys Xaa Asn Trp Ala Xaa Gin 
355 360 365 

att tat gca ggg 1116 
lie Tyr Ala Gly 
370 

<210> 109
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 109

cct caa atc act ctt tgg caa cga ccc atc gtc aca gta aag ata gag 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Ile Val Thr Val Lys Ile Glu
1 5 10 15

ggg cag cta aag gaa gct yta tta gat aca gga gca gat aat aca gta 96
Gly Gln Leu Lys Glu Ala Xaa Leu Asp Thr Gly Ala Asp Asn Thr Val
20 25 30

ttg gam gaa ata aat ttg cca gga aga tgg aaa cca aaa atg ata ggg 144
Leu Xaa Glu Ile Asn Leu Pro Gly Arg Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gra ggt ttt atc aaa gta aam cag tat gat sag ata mcc ata 192
Gly Ile Xaa Gly Phe Ile Lys Val Xaa Gln Tyr Asp Xaa Ile Xaa Ile
50 55 60

gac atc tgt gga cat aaa gta ata ggt aca ata tta gta gga cct aca 240
Asp Ile Cys Gly His Lys Val Ile Gly Thr Ile Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga gat ctg ttg act cag att ggc tgc act 288
Pro Val Asn Ile Ile Gly Arg Asp Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca ttg aca gar gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta gta gaa att tgt aca gaa ttg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Leu Glu Lys Glu Gly
130 135 140

aag att tca aaa att ggg cct gaa aat cca tac aac act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aag gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cat cct gca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat tty tca gtt ccc tta gmt aaa gaa tnn nnn 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Xaa Lys Glu Xaa Xaa
210 215 220

nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn nnn 720
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
225 230 235 240

nnn nnn nnn nnn nnn nnn nnn nnn cca cag gga tgg aaa gga tca cca 768
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gaa ata gtt atc tac car tac rtg gat gay ttg ttw gta 864
Gln Asn Pro Glu Ile Val Ile Tyr Gln Tyr Xaa Asp Asp Leu Xaa Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg ttg agg tgg gga ttt acc aca cca gac aaa aaa cat 960
Arg Gln His Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggy tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Xaa Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat cca ggg 1116
Ile Tyr Pro Gly
370

<210> 110
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 110

cyt cag atc act ctt tgg caa cga ccc cts gtc aca ata aag gta ggg 48
Xaa Gln Ile Thr Leu Trp Gln Arg Pro Xaa Val Thr Ile Lys Val Gly
1 5 10 15

ggg caa eta aag gaa get eta tta gat aca gga gca gat gat aca gta 96 
Gly Gin Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 

tta gaa gaa atr aat ttg cca ggr aaa tgg aaa cca awa atg ata ggg 144 
Leu Glu Glu Xaa Asn Leu Pro Xaa Lys Trp Lys Pro Xaa Met lie Gly 
35 40 45 

gga att gga ggt ttt ate aaa gta aga cag tat gat cag ata etc ata 192 
Gly lie Gly Gly Phe lie Lys Val Arg Gin Tyr Asp Gin lie Leu lie 
50 55 60 

gaa atc tgt gga cat aaa gct ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80 

cct gtc aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288 
Pro Val Asn He He Gly Arg Asn Leu Leu Thr Gin He Gly Cys Thr 
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336 
Leu Asn Phe Pro He Ser Pro He Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca ggg atg gat ggc cca aaa gtt aag caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa ata aaa gca tta ata gaa ate tgt aca gaa atg gaa aag gaa gga 432 
Lys lie Lys Ala Leu lie Glu lie Cys Thr Glu Met Glu Lys Glu Gly 
130 135 140 

aaa att tea aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480 
Lys He Ser Lys He Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175 

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190 

ata cca cat ccc gca ggg tta aaa aag aaa aaa tca gtg aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat gaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg
210 215 220

aag tac act gca ttt mcc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Xaa Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240 

att aga tat cag tac aat gtg ctt cca caa gga tgg aaa gga tea cca 768 
He Arg Tyr Gin Tyr Asn Val Leu Pro Gin Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa agt age atg aca aaa ate tta gag cct ttt aga aaa 816 
Ala He Phe Gin Ser Ser Met Thr Lys He Leu Glu Pro Phe Arg Lys 
260 265 270 

caa mat cca gac atg gty ate tat caa tac atg gat gat ttg tat gta 864 
Gin Xaa Pro Asp Met Xaa He Tyr Gin Tyr Met Asp Asp Leu Tyr Val 
275 280 285 

gga tct gac tta gaa ata ggr cag cat aga gca aaa ata gag gaa ctg 912 
Gly Ser Asp Leu Glu He Xaa Gin His Arg Ala Lys He Glu Glu Leu 
290 295 300 

aga cag cat ttg ttg aag tgg gga ttt acc aca cca gac aaa aag cat 960
Arg Gln His Leu Leu Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320 

cag aaa gag cct cca ttc ctt tgg atg ggt tat gaa etc cat cct gat 1008 
Gin Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp 
325 330 335 

aaa tgg aca gta cag cct ata gag ctg cca gaa aar gam age tgg act 1056 
Lys Trp Thr Val Gin Pro He Glu Leu Pro Glu Lys Xaa Ser Trp Thr 
340 345 350 

gtc aat gac ata cag aaa ata gtg gga aaa ttg aat tgg gca agt cag 1104 
Val Asn Asp He Gin Lys He Val Gly Lys Leu Asn Trp Ala Ser Gin 
355 360 365 

att tac cca ggg 1116 
He Tyr Pro Gly 
370 

<210> 111
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 111

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa ata aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Ile Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg agc ttg cca gga aaa tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Ser Leu Pro Gly Lys Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta agm cag tat gwt cat ata ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Xaa Gln Tyr Xaa His Ile Pro Ile
50 55 60

gaa wtc tgt ggm cat aaa gct gaa ggt aca gta tta ata gga cct aca 240
Glu Xaa Cys Xaa His Lys Ala Glu Gly Thr Val Leu Ile Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag ctt ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt ccc ata agt cct att gaa act gta cca gta aga cta aaa 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Arg Leu Lys
100 105 110

cca gga atg gat ggg cca aaa gtt aag caa tgg cca cta aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125

aaa atc aaa gca ttg ata gaa att tgt aca gaa atg gaa aag gaa gga 432
Lys Ile Lys Ala Leu Ile Glu Ile Cys Thr Glu Met Glu Lys Glu Gly
130 135 140

aaa att gaa aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Glu Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata agg aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Arg Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttt tgg gaa att caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Ile Gln Leu Gly
180 185 190

ata cca cat ccc gca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220 

aag tat act gca ttt acc ata cct agt gta aat aat gag aca cca gga 720 
Lys Tyr Thr Ala Phe Thr He Pro Ser Val Asn Asn Glu Thr Pro Gly 
225 230 235 240 

att aga tat caa tac aat gtg ctt cca caa gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255 

gca ata ttc caa agt age atg aca aaa ate tta gag cct ttt aga aaa 816 
Ala He Phe Gin Ser Ser Met Thr Lys He Leu Glu Pro Phe Arg Lys 
260 265 270 

caa aat cca gaa yta gtt ate tac caa tac atg gat gat ttg tat gta 864 
Gin Asn Pro Glu Xaa Val He Tyr Gin Tyr Met Asp Asp Leu Tyr Val 
275 280 285 

gga tea gac tta gaa ata gar aag cat aga gca aaa ata gag gaa ctg 912 
Gly Ser Asp Leu Glu He Glu Lys His Arg Ala Lys He Glu Glu Leu 
290 295 300 

aga gaa cat ctg tya aaa tgg ggg ttt acc aca cca gac aaa aaa cat 960 
Arg Glu His Leu Xaa Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys His 
305 310 315 320 

cag aaa gaa cct cca ttt ctt tgg atg ggt tat gaa etc cat cct gat 1008 
Gin Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp 
325 330 335 

aaa tgg aca gta cag acc ata aag ctg cca gaa aaa gac age tgg act 1056 
Lys Trp Thr Val Gin Thr He Lys Leu Pro Glu Lys Asp Ser Trp Thr 
340 345 350 

gtc aat gat ata cag aag tta gtg gga aaa ttg aat tgg gca agt caa 1104 
Val Asn Asp He Gin Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gin 
355 360 365 

att tat cca ggg 1116 
He Tyr Pro Gly 
370 

<210> 112
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 112

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg cag eta aag gaa get eta tta gat aca gga gca gat gat aca gta 96 
Gly Gin Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 
tta gaa gaa atg aat ttg cca gga aga tgg aaa cca aaa atk ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Arg Trp Lys Pro Lys Xaa Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag ata ctt gta 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Leu Val
50 55 60

gaa att tgt gga cat aaa gct ata ggt aca gta tta gta gga cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95 

tta aat ttt ccc att agt cct att gag act gta cca gta aaa tta aag 336 

Leu Asn Phe Pro lie Ser Pro lie Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggc cca aaa gtc aaa caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125 

aaa ata aaa gca tta atg gaa att tgt gca gaa wtg gaa aag gaa gga 432 

Lys lie Lys Ala Leu Met Glu He Cys Ala Glu Xaa Glu Lys Glu Gly 
130 135 140 

aaa att tea aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480 

Lys He Ser Lys He Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 

145 150 155 160 

gcc ata aag aaa aaa gac agc act aaa tgg ara aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Xaa Lys Leu Val Asp Phe
165 170 175 

aga gaa ctt aat aar aga act caa gac ttt tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190 

ata cca cat ccc gca ggg tta aaa aag aaa aaa tea gta aca gta eta 624 

He Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu 
195 200 205 

gat gtg ggt gat gca tat ttt tea gtt ccc tta gat gaa gac ttc agg 672 

Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg 
210 215 220 

aag tat act gca ttt acc ata cct agt ata aac aat gag acm cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Xaa Pro Gly
225 230 235 240

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tea cca 768 

He Arg Tyr Gin Tyr Asn Val Leu Pro Gin Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa agt age atg aca aaa att tta gag cct ttt aga aaa 816 

Ala He Phe Gin Ser Ser Met Thr Lys He Leu Glu Pro Phe Arg Lys 
260 265 270 

caa aat cca gac ata gtt ate tat caa tac atg gat gat ttg tat gta 864 

Gin Asn Pro Asp He Val He Tyr Gin Tyr Met Asp Asp Leu Tyr Val 
275 280 285 

gga tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga cag cat ctg ttg aag tgg gga ttk tmc aca cca gac aaa aaa cat 960
Arg Gln His Leu Leu Lys Trp Gly Xaa Xaa Thr Pro Asp Lys Lys His
305 310 315 320 

cag aaa saa cct cca ttc ctt tgg atg ggt tat gaa etc cmt cct gat 1008 

Gin Lys Xaa Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu Xaa Pro Asp 
325 330 335 

aaa tgg aca gta caa cct ata gtg ctg cca gaa aag gac age tgg act 1056 

Lys Trp Thr Val Gin Pro lie Val Leu Pro Glu Lys Asp Ser Trp Thr 
340 345 350 

gtc aat gac ata cag aag tta gtg gga aaa ttr aat tgg gca agt cag 1104 

Val Asn Asp lie Gin Lys Leu Val Gly Lys Xaa Asn Trp Ala Ser Gin 
355 360 365 

att tac gca ggg 1116 

lie Tyr Ala Gly 
370 



<210> 113
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 113

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg caa cta aag gaa gct cta tta gat aca gga gca gat gat aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30

tta gaa gaa atg aat ttg cca gga aaa tgg aaa cca aaa atg ata ggg 144
Leu Glu Glu Met Asn Leu Pro Gly Lys Trp Lys Pro Lys Met Ile Gly
35 40 45

gga att gga ggt ttt atc aaa gta aga cag tat gat cag ata ctc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Arg Gln Tyr Asp Gln Ile Leu Ile
50 55 60

gaa atc tgt gga cat aaa act ata ggt aca gta tta ata gga cct aca 240
Glu Ile Cys Gly His Lys Thr Ile Gly Thr Val Leu Ile Gly Pro Thr
65 70 75 80

cct gtc aac ata att gga aga aat ctg ttg act cag ctt ggt tgt act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Leu Gly Cys Thr
85 90 95

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110

cca gga atg gat ggt cca aga gtt aaa caa tgg cca ttg acm gaa gaa 384
Pro Gly Met Asp Gly Pro Arg Val Lys Gln Trp Pro Leu Xaa Glu Glu
115 120 125

aaa ata aaa gca tta ata gaa atc tgc aca gaa atg gaa aag gam sga 432
Lys Ile Lys Ala Leu Ile Glu Ile Cys Thr Glu Met Glu Lys Xaa Xaa
130 135 140

waa att tca aaa mta ggg cct gam wat cca tac aat act cca gta ttt 480
Xaa Ile Ser Lys Xaa Gly Pro Xaa Xaa Pro Tyr Asn Thr Pro Val Phe
145 150 155 160

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtt caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190

ata cca cac ccg gca ggg tta aaa aag aac aaa tca gta aca gtg ttg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Asn Lys Ser Val Thr Val Leu
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gag ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Glu Phe Arg
210 215 220

aag tat act gca ttt acc ata cct agt ata aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240

atc aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255

gca ata ttc caa tst agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Xaa Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270

caa aat cca gaa ata gtt atc tgt caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Glu Ile Val Ile Cys Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285

gga tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ttg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga gaa cat ctg ttg aag tgg gga ttt acc aca cca gat aaa aaa cat 960
Arg Glu His Leu Leu Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys His
305 310 315 320

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gag ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350

gtc aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365

att tat gca ggg 1116
Ile Tyr Ala Gly
370

<210> 114
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 114

cmt caa atm amt ctt tgg car mra ccc cta gtc cna awn nmm gkk agg 48
Xaa Gln Xaa Xaa Leu Trp Gln Xaa Pro Leu Val Xaa Xaa Xaa Xaa Arg
1 5 10 15

ggg gca aat aag gaa gct cta tta gac aca gga gca gat gat mca gta 96
Gly Ala Asn Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Xaa Val
20 25 30 

tta gaa gaa atg wat tta cca gga aaa tgg aaa cca aaa atg ata ggg 144 
Leu Glu Glu Met Xaa Leu Pro Gly Lys Trp Lys Pro Lys Met lie Gly 
35 40 45 

gga att gga ggt ttt ate aaa gta agn cag tat gag cag ata ccc ata 192 
Gly lie Gly Gly Phe lie Lys Val Xaa Gin Tyr Glu Gin lie Pro lie 
50 55 60 

gaa ate tgt gga cat aaa get ata ggt aca gta ttg gta ggm cct aca 240 
Glu lie Cys Gly His Lys Ala lie Gly Thr Val Leu Val Xaa Pro Thr 
65 70 75 80 

cct gtc aac ata att gga aga aat ctg ttg act cag att ggt tgc act 288
Pro Val Asn Ile Ile Gly Arg Asn Leu Leu Thr Gln Ile Gly Cys Thr
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cca gtg aaa tta aag 336
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110 

cca gga atg gat ggc cca aaa gtt aaa caa tgg cca tta aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125 

aaa ata aaa gca tta gta gaa att tgt aca gaa atg gaa aaa gaa ggg 432 
Lys lie Lys Ala Leu Val Glu lie Cys Thr Glu Met Glu Lys Glu Gly 
130 135 140 

aaa att tea aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480 
Lys lie Ser Lys lie Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gcc ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175 

aga gaa ctt aat aag aga act caa gac ttc tgg gaa gtc caa tta gga 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190 

ata cca cat cct gca ggg tta aaa aag aaa aaa tea gta aca gtg ctg 624 
lie Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu 
195 200 205 
gac gtg ggt gat gca tat ttt tca gtt ccc tta gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220 

aag tat act gca ttt tcy ata cct agt aca aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Xaa Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240 

agt agg tat caa tac aat gtg ctt cca cag gga tgg aaa gga tea cca 768 
Ser Arg Tyr Gin Tyr Asn Val Leu Pro Gin Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa agt age atg ata aaa ate tta gag cct ttt aga aaa 816 
Ala lie Phe Gin Ser Ser Met lie Lys lie Leu Glu Pro Phe Arg Lys 
260 265 270 

caa aat cca raa att gtg ate tat cma tac mtg gat gat ttg tat gta 864 
Gin Asn Pro Xaa lie Val lie Tyr Xaa Tyr Xaa Asp Asp Leu Tyr Val 
275 280 285 

gga tct gac tta gaa ata gaa cag cat aga aca aaa ata gag gaa ctg 912 
Gly Ser Asp Leu Glu lie Glu Gin His Arg Thr Lys lie Glu Glu Leu 
290 295 300 

aga caa cat ctg ttg agg tgg gga ttt acc aca cca gac aag aaa cat 960 
Arg Gin His Leu Leu Arg Trp Gly Phe Thr Thr Pro Asp Lys Lys His 
305 310 315 320 

cag aar gaa cct ccg ttc ctt tgg atg ggt tat gaa etc cat cct gat 1008 
Gin Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp 
325 330 335 

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac ags ttg rct 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Xaa Leu Xaa
340 345 350 

kca aat gac ata cag aag tta gtg gga aaa ttg aat tgg gca agt cag 1104 
Xaa Asn Asp He Gin Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gin 
355 360 365 

att tac tea ggg 1116 
He Tyr Ser Gly 
370 

<210> 115
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 115

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg cag eta aag gaa get eta ata gat aca gga gca gat gat aca gtg 96 
Gly Gin Leu Lys Glu Ala Leu He Asp Thr Gly Ala Asp Asp Thr Val 
20 25 30 
tta gaa gaa atg agt ata cca gga aaa tgg aaa cca aaa ttg ata ggg 144
Leu Glu Glu Met Ser Ile Pro Gly Lys Trp Lys Pro Lys Leu Ile Gly
35 40 45 

gga att gga ggt ttt ate aaa gta aga cag tat gat cag gkg ccc gta 192 
Gly lie Gly Gly Phe lie Lys Val Arg Gin Tyr Asp Gin Xaa Pro Val 
50 55 60 

gaa att tgt gga cat aaa get ata ggt mca gtw tta ata ggm cct aca 240 
Glu He Cys Gly His Lys Ala He Gly Xaa Xaa Leu He Xaa Pro Thr 
65 70 75 80 

cct gec aac ata att gga agg aat ctg ttg act cag att ggt tgc act 288 
Pro Ala Asn He He Gly Arg Asn Leu Leu Thr Gin He Gly Cys Thr 
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336 
Leu Asn Phe Pro He Ser Pro He Glu Thr Val Pro Val Lys Leu Lys 
100 105 110 

cca gga atg gat ggc cca aaa gtt aag caa tgg cca ttg aca gaa gag 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125 

aaa ata aaa gca tta aca gaa att tgt aca gaa atg gaa aag gaa gga 432 
Lys He Lys Ala Leu Thr Glu He Cys Thr Glu Met Glu Lys Glu Gly 
130 135 140 

aag att tea aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480 
Lys He Ser Lys He Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe 
145 150 155 160 

gee ata aag aaa aaa gac agt act aaa tgg aga aaa tta gta gat ttc 528 
Ala He Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe 
165 170 175 

aga gaa ctt aat aag aga act caa gat ttc tgg gaa gtt caa tta gga 576 
Arg Glu Leu Asn Lys Arg Thr Gin Asp Phe Trp Glu Val Gin Leu Gly 
180 185 190 

ata cca cat cct gca ggg tta aaa aag aaa aaa tea gta aca gta ctg 624 
He Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu 
195 200 205 

gat gtg ggt gat gca tat ttt tea gtt ccc tta gat gaa gac ttt agg 672 
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg 
210 215 220 

aaa tat act gca ttt acc ata cct agt gta aac aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Val Asn Asn Glu Thr Pro Gly
225 230 235 240 

att aga tat cag tat aat gtg ctt cca cag gga tgg aaa gga tea cca 768 
He Arg Tyr Gin Tyr Asn Val Leu Pro Gin Gly Trp Lys Gly Ser Pro 
245 250 255 

gca ata ttc caa tgt agt atg aca aaa ata tta gag ccc ttt aga aaa 816 
Ala He Phe Gin Cys Ser Met Thr Lys He Leu Glu Pro Phe Arg Lys 
260 265 270 

caa aat cca gac eta gtt ate tat caa tac gtg gat gat ttg tat gta 864 
Gin Asn Pro Asp Leu Val He Tyr Gin Tyr Val Asp Asp Leu Tyr Val 
275 280 285 

gga tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300

aga caa cat ctg ttg aaa tgg ggt ttt acc aca cca gac aaa aag cat 960 
Arg Gin His Leu Leu Lys Trp Gly Phe Thr Thr Pro Asp Lys Lys His 
305 310 315 320 

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cca gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335 

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350 

gtc aat gac ata cag aag tta gtg gga aaa tta aat tgg gca agt cag 1104 
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365 

att tac cca ggg 1116 
Ile Tyr Pro Gly
370 

<210> 116
<211> 1116
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1116)
<223> Portion of HIV Reverse Transcriptase

<400> 116

cct cag atc act ctt tgg caa cga ccc ctc gtc aca ata aag ata ggg 48
Pro Gln Ile Thr Leu Trp Gln Arg Pro Leu Val Thr Ile Lys Ile Gly
1 5 10 15

ggg cag cta aag gaa gct cta tta gat aca gga gca gat gac aca gta 96
Gly Gln Leu Lys Glu Ala Leu Leu Asp Thr Gly Ala Asp Asp Thr Val
20 25 30 

tta gaa gaa ata agt ctg cca gga aga tgg aaa cca aaa ttg ata ggg 144 
Leu Glu Glu Ile Ser Leu Pro Gly Arg Trp Lys Pro Lys Leu Ile Gly
35 40 45 

gga att gga ggt ttt atc aaa gta aag cag tat gat cag ata ccc ata 192
Gly Ile Gly Gly Phe Ile Lys Val Lys Gln Tyr Asp Gln Ile Pro Ile
50 55 60 

gaa atc tgt gga cat aaa gct ata ggt aca gta tta gta ggm cct aca 240
Glu Ile Cys Gly His Lys Ala Ile Gly Thr Val Leu Val Xaa Pro Thr
65 70 75 80 

cct gtc aac ata gtt gga aga aat ctg ttg act cag ctt ggt tgc act 288
Pro Val Asn Ile Val Gly Arg Asn Leu Leu Thr Gln Leu Gly Cys Thr
85 90 95 

tta aat ttt ccc att agt cct att gaa act gta cca gta aaa tta aag 336 
Leu Asn Phe Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys
100 105 110 

cca gga atg gat ggc cca aag gtt aag caa tgg cca ttg aca gaa gaa 384
Pro Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu
115 120 125 

aaa ata aaa gca tta gta gaa att tgt aca gaa ttg gaa aag gaa ggg 432 
Lys Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Leu Glu Lys Glu Gly
130 135 140 

aaa att tca aaa att ggg cct gaa aat cca tac aat act cca gta ttt 480
Lys Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe
145 150 155 160 

gcc ata aag aaa aaa gac agt aca aaa tgg aga aaa tta gta gat ttc 528
Ala Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe
165 170 175 

aga gaa ctt aat aag aga act caa gac ttt tgg gaa gtt caa cta ggg 576
Arg Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly
180 185 190 

ata cca cat ccc gca ggg tta aaa aag aaa aaa tca gta aca gta ctg 624
Ile Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu
195 200 205 

gat gtg ggt gat gca tat ttt tca gtt ccc ttg gat aaa gac ttc agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg 
210 215 220 

aag tac act gca ttt acc ata cct agt ata aat aat gag aca cca ggg 720
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly
225 230 235 240 

att aga tat caa tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255 

gca ata ttc caa agt agc atg aca aaa atc tta gag cct ttt aga aaa 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys
260 265 270 

caa aat cca gac ata gtt atc tat caa tac gta gat gac ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Val Asp Asp Leu Tyr Val
275 280 285 

gga tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912 
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300 

aga caa cat ctg tgg aag tgg ggg ttt tac aca cca gat aaa aaa cat 960
Arg Gln His Leu Trp Lys Trp Gly Phe Tyr Thr Pro Asp Lys Lys His
305 310 315 320 

cag aaa gaa cct cca ttc ctt tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335 

aaa tgg aca gta cag cct ata gtg ctg cca gaa aag gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350 

gtc aat gac ata cag aag tta gtg gga aaa tta aat tgg gca agt cag 1104 
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365 

att tac cca ggg 1116 
Ile Tyr Pro Gly
370 






<210> 117
<211> 1119
<212> DNA
<213> Human Immunodeficiency Virus (HIV)

<220>
<221> CDS
<222> (0)...(297)
<223> HIV Protease

<221> CDS
<222> (298)...(1119)
<223> Portion of HIV Reverse Transcriptase

<400> 117
195 200 205

gat gtg ggt gat gca tat ttt tca gtt ccc tta gac aag gac ttt agg 672
Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Lys Asp Phe Arg
210 215 220

aaa tat act gca ttt acc ata cct agt aca aac aat gag aca cca ggg 720 
Lys Tyr Thr Ala Phe Thr Ile Pro Ser Thr Asn Asn Glu Thr Pro Gly
225 230 235 240 

att aga tat cag tac aat gtg ctt cca cag gga tgg aaa gga tca cca 768
Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro
245 250 255 

gca ata ttc caa agc agc atg aca aaa atc tta gat cct ttt aga aag 816
Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Asp Pro Phe Arg Lys
260 265 270 

caa aat cca gac ata gtt atc tgt caa tac atg gat gat ttg tat gta 864
Gln Asn Pro Asp Ile Val Ile Cys Gln Tyr Met Asp Asp Leu Tyr Val
275 280 285 

gga tct gac tta gaa ata ggg cag cat aga aca aaa ata gag gaa ctg 912 
Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu
290 295 300 

aga gaa cat ctg tgg aag tgg ggg ttt tac aca cca gac aaa aaa cat 960
Arg Glu His Leu Trp Lys Trp Gly Phe Tyr Thr Pro Asp Lys Lys His
305 310 315 320 

cag aaa gaa cct ccg ttc ctc tgg atg ggt tat gaa ctc cat cct gat 1008
Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp
325 330 335 

aaa tgg aca gta cag cct ata gtg ctg cca gaa aaa gac agc tgg act 1056
Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr
340 345 350 

gtc aat gac ata cag aag tta gtg gga aaa ttg aac tgg gca agt cag 1104 
Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln
355 360 365 

att tat yca ggg att 1119
Ile Tyr Xaa Gly Ile
370 



<210> 118
<211> 979
<212> PRT
<213> Human Immunodeficiency Virus
Ala Phe Thr Ile
130

Gln Tyr Asn Val
145

Gln Ser Ser Met

Glu Ile Val Ile
180

Leu Glu Ile Gly
195

Leu Leu Lys Trp
210

Pro Pro Phe Leu
225

Val Gln Pro Ile

Ile Gln Lys Leu
260

Gly Ile Lys Val
275

Leu Thr Glu Val 
290 

Glu Asn Arg Glu 
305 

Pro Ser Lys Asp 

Trp Thr Tyr Gin 
340 

Lys Tyr Ala Arg 
355 

Thr Glu Ala Val 
370 

Lys Thr Pro Lys 
385 

Trp Trp Thr Glu 

Val Asn Thr Pro 
420 

Pro Ile Val Gly
435

Glu Thr Lys Leu
450

Lys Val Val Thr
465

Ala Ile Tyr Leu

Thr Asp Ser Gln
500

Ser Glu Ser Glu
Glu Lys Val Tyr
Pro Ile Glu Thr
Thr Arg Ile Leu
165

Tyr Gln Tyr Met
Trp Met Gly Tyr
Val Gly Lys Leu

Arg Gln Leu Cys
Leu Ile Ala Glu
535

Val Pro Val Lys
Asn Glu Thr Pro
Gly Ile Arg Tyr

Pro Ala Ile Phe
160

Lys Gln Asn Pro
175 

Val Gly Ser Asp 
190 

Leu Arg Gly His 
205 

His Gln Lys Glu

Asp Lys Trp Thr 
240 

Thr Val Asn Asp 
255 

Gln Ile Tyr Ala
270 

Gly Thr Lys Ala 
285 

Leu Glu Leu Ala 

Val Tyr Tyr Asp 
320 

Gly Gln Gly Gln
335 

Leu Lys Thr Gly 
350 

Val Lys Gln Leu
365 

Val Ile Trp Gly

Thr Trp Glu Thr 
400 

Glu Trp Glu Phe 
415 

Leu Glu Lys Glu 
430 

Ala Ala Asn Arg 
445 

Arg Gly Arg Gln

Thr Glu Leu Gln
480

Val Asn Ile Val
495

Gln Pro Asp Gln
510

Leu Ile Lys Lys
525

Gly Ile Gly Ser

Met Asp Gly Pro 
560 

Lys Ala Leu Val 
575 

Ser Lys Ile Gly
590 

Lys Lys Lys Asp 
605 

Leu Asn Lys Arg 

His Pro Ala Gly 
640 

Gly Asp Ala Tyr 
655 

Thr Ala Phe Thr 



Ile Pro Ser Arg
675

Val Leu Pro Gln
690

Met Thr Arg Ile
705

Ile Tyr Gln Tyr

Gly Gln His Arg
740

Trp Gly Phe Thr
755

Leu Trp Met Gly
770

Ile Lys Leu Pro
785

Leu Val Gly Lys

Val Arg Gln Leu
820

Val Ile Pro Leu
835

Glu Ile Leu Lys
850

Asp Leu Ile Ala
865

Gln Ile Tyr Gln

Arg Met Arg Gly
900

Val Gln Lys Ile
915

Lys Phe Lys Leu
930

Glu Tyr Trp Gln
945

Pro Pro Leu Val

Gly Ala Glu

Asn Asn Glu Thr
680

Gly Trp Lys Gly
695

Leu Glu Pro Phe
710

Met Asp Asp Leu

Ala Lys Ile Glu

Thr Pro Asp Lys
760

Tyr Glu Leu His
775

Glu Lys Asp Ser
790

Leu Asn Trp Ala
805

Cys Lys Leu Leu

Thr Glu Glu Ala
840

Glu Pro Val His
855

Glu Ile Gln Lys
870

Glu Pro Phe Lys
885

Ala His Thr Asn

Thr Thr Glu Ser
920

Pro Ile Gln Lys
935

Ala Thr Trp Ile
950

Lys Leu Trp Tyr
965









665 

Pro Gly Ile Arg

Ser Pro Ala Ile
700

Arg Lys Gln Asn
715 

Tyr Val Gly Ser 
730 

Glu Leu Arg Gly 
745 

Lys His Gln Lys

Pro Asp Lys Trp 
780 

Trp Thr Val Asn 
795 

Ser Gln Ile Tyr
810 

Arg Gly Thr Lys 
825 

Glu Leu Glu Leu 

Gly Val Tyr Tyr 
860 

Gln Gly Gln Gly
Asn Leu Lys Thr
890 

Asp Val Lys Gln
905

Ile Val Ile Trp

Glu Thr Trp Glu 
Pro Glu Trp Glu
Gln Leu Glu Lys
Tyr Gln Tyr Asn
Phe Gln Ser Ser

Pro Glu Ile Val
720

Asp Leu Glu Ile
Ala Gly Ile Lys
815 

Ala Leu Thr Glu 
Ala Glu Asn Arg
Asp Pro Ser Lys

Gln Trp Thr Tyr
880 

Gly Lys Tyr Ala 
895 

Leu Thr Glu Ala 
910 

Gly Lys Thr Pro 
925 

Thr Trp Trp Thr 

Phe Val Asn Thr 
960 

Glu Pro Ile Val